Senior Manager Site Reliability Engineer

Bank of Montreal

Job Location:

Toronto - Canada

Monthly Salary: $94,600 - $176,000
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

Application Deadline:

09/29/2025

Address:

33 Dundas Street West

Job Family Group:

Technology

We are seeking a seasoned Site Reliability Engineer (SRE) to join our innovative and vibrant team, which is at the forefront of solutions that revolutionize how clients engage within the financial space. This person will design how code is deployed, configured and monitored, as well as the availability, latency, change management, emergency response and capacity management of services in production. Helps teams determine what new features can be incorporated, and when, by using service-level agreements (SLAs) to define the required reliability of the system through service-level indicators (SLIs) and service-level objectives (SLOs). Applies software engineering to automate IT operations tasks, e.g. production system management, change management, incident response and emergency response. Acts as a link between the development and operations teams. Applies expertise to conduct chaos tests and performance tests for critical business requirements.

The ideal candidate will be a proactive, solution-oriented individual contributor who thrives working in a dynamic environment. You will play a crucial role in managing and optimizing our production and development environments, implementing automation strategies to streamline operations, and collaborating closely with our infrastructure teams to enhance product reliability. We're looking for a visionary thinker capable of understanding both the technical and business implications of their work. Your ability to communicate effectively with stakeholders at all levels will be key to our collective success.

  • Oversee and enhance our infrastructure, ensuring high availability, scalability, security and fault tolerance.
  • Design, develop and maintain reliable and scalable systems that support BMO's platforms.
  • Collaborate with teams to improve system architecture, performance and reliability.
  • Automate processes to monitor, manage and deploy various platform and supporting systems.
  • Conduct system capacity planning and performance analysis to identify bottlenecks, optimize system performance and manage costs.
  • Implement and maintain monitoring and alerting systems to proactively identify and address potential issues. Respond to and resolve incidents and outages in a timely manner, ensuring minimal disruption.
  • Conduct post-incident reviews to identify root causes and implement preventive measures.
  • Ensure compliance with security best practices and implement measures to protect data and systems.
  • Helps the development and operations teams establish service-level indicators (SLIs), service-level objectives (SLOs) and error budgets.
  • Performs automation to increase efficiency and decrease risk in areas like log analysis, performance tuning, patch application, testing of production settings, incident response and post-mortem analysis.
  • Supports system design consulting, platform management and capacity planning.
  • Debugs production issues across services and levels of the technology stack.
  • Improves service health visibility by recording metrics, logs and traces across all services in order to pinpoint the causes of an incident.
  • Computes the cost of SLA breaches and assists management in calculating the impact of system reliability. Helps development and operations teams understand the cost of downtime.
  • Operates at a group/enterprise-wide level and serves as a specialist resource to senior leaders and stakeholders.
  • Applies expertise and thinks creatively to address unique or ambiguous situations and to find solutions to problems that can be complex and non-routine.
  • Implements changes in response to shifting trends.
  • Create consolidated dashboards for collected metrics that help upper management track performance improvements.
  • Broader work or accountabilities may be assigned as needed.

Qualifications

  • Experience with full instrumentation of monitoring tools such as Dynatrace, Splunk and CloudWatch
  • Understanding of operating systems like Linux and mainframes, and a deep understanding of databases
  • Experience conducting post-incident reviews and enabling mitigation/resolution plans
  • Proficiency in at least one coding language: Python, Java, Ruby, PowerShell or JavaScript.
  • Familiar with CI/CD pipelines in ADO and AWS
  • Experience with cloud-native applications and containerization
  • Cybersecurity and privacy concepts, principles and solutions.
  • Emotional agility.

Advanced level of proficiency:

  • IT Infrastructure Library (ITIL).
  • Robotic Process Automation.
  • Cloud Computing.
  • Experience with deployment automation tools like Terraform, Packer and Ansible.
  • Expertise in log aggregation and system monitoring tools (Datadog, CloudWatch, Prometheus, Grafana).
  • Knowledge of security monitoring and incident response tools.
  • Proficiency in containerization of applications and expertise in managing containerized environments.
  • System Design and Implementation.
  • Incident management.
  • Learning Agility.
  • Building and managing relationships.
  • API Management.
  • Automation and Automation Pipelines.
  • Automated Testing.
  • Quality Assurance and Control.
  • Verbal & written communication skills.
  • Analytical and problem-solving skills.
  • Collaboration & team skills; with a focus on cross-group collaboration.
  • Able to manage ambiguity.
  • Data-driven decision making.
  • Typically 7 years of relevant experience and a post-secondary degree in a related field of study, or an equivalent combination of education and experience.
  • Seasoned professional with a combination of education, experience and industry knowledge.

Salary:

$94,600.00 - $176,000.00

Pay Type:

Salaried

The above represents BMO Financial Group's pay range and type.

Salaries will vary based on factors such as location, skills, experience, education and qualifications for the role, and may include a commission structure. Salaries for part-time roles will be pro-rated based on number of hours regularly worked. For commission roles, the salary listed above represents BMO Financial Group's expected target for the first year in this position.

BMO Financial Group's total compensation package will vary based on the pay type of the position and may include performance-based incentives, discretionary bonuses, as well as other perks and rewards. BMO also offers health insurance, tuition reimbursement, accident and life insurance, and retirement savings plans. To view more details of our benefits, please visit: Us

At BMO we are driven by a shared Purpose: Boldly Grow the Good in business and life. It calls on us to create lasting, positive change for our customers, our communities and our people. By working together, innovating and pushing boundaries, we transform lives and businesses, and power economic growth around the world.

As a member of the BMO team you are valued, respected and heard, and you have more ways to grow and make an impact. We strive to help you make an impact from day one for yourself and our customers. We'll support you with the tools and resources you need to reach new milestones as you help our customers reach theirs. From in-depth training and coaching to manager support and network-building opportunities, we'll help you gain valuable experience and broaden your skillset.

To find out more visit us at

BMO is committed to an inclusive, equitable and accessible workplace. By learning from each other's differences, we gain strength through our people and our perspectives. Accommodations are available on request for candidates taking part in all aspects of the selection process. To request accommodation, please contact your recruiter.

Note to Recruiters: BMO does not accept unsolicited resumes from any source other than directly from a candidate. Any unsolicited resumes sent to BMO, directly or indirectly, will be considered BMO property. BMO will not pay a fee for any placement resulting from the receipt of an unsolicited resume. A recruiting agency must first have a valid, written and fully executed agency agreement contract for service to submit resumes.


Required Experience:

Senior Manager

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

We cover the whole balance sheet, from foreign exchange, trade finance and treasury management to corporate lending, securitization, public and private debt and equity underwriting. Our team of experts can also provide a full range of advisory services, along with industry-leading res ...