Senior Site Reliability Engineering (SRE) Manager

SWIFT

Job Location: Kuala Lumpur - Malaysia
Monthly Salary: Not Disclosed
Posted on: 2 days ago
Vacancies: 1 Vacancy

Job Summary

ABOUT US

We're the world's leading provider of secure financial messaging services, headquartered in Belgium. We are the way the world moves value across borders, through cities, and overseas. No other organisation can address the scale, precision, pace, and trust that this demands, and we're proud to support the global economy.

We're unique too. We were established to find a better way for the global financial community to move value: a reliable, safe, and secure approach that the community can trust completely. We're always striving to be better and are constantly evolving in an ever-changing landscape, without undermining that trust. Five decades on, our vibrant community reflects the complexity and diversity of the financial ecosystem. We innovate diligently, test exhaustively, then implement. In a connected and exciting era, our mission has never been more relevant. Swift now has a presence in 200 countries and legal territories to serve a community of more than 12,000 banks and financial institutions.

About the Role

As a Senior Site Reliability Engineering Manager, you will lead a team responsible for the reliability, observability, and automation of SWIFT's monitoring platform, which powers infrastructure, network, and synthetic monitoring. You will ensure high availability for critical services while driving an automation-first culture. This role requires hands-on experience in troubleshooting complex systems, scaling distributed platforms, and mentoring a team to own operational excellence.

Key Responsibilities

1. Team Building and Mentorship:

  • Recruit, retain, and grow engineers with expertise in monitoring, observability, and automation.
  • Mentor team members on incident response, root cause analysis, and production troubleshooting.

2. Operational Leadership:

  • Own the reliability, uptime, and performance of monitoring and observability platforms.
  • Lead incident management, major incident response, and post-incident reviews.
  • Drive automation to reduce manual operational work, including runbooks and self-healing systems.

3. Collaboration and Alignment:

  • Partner with Product Owners, Engineering Leads, and cross-functional teams to align SRE priorities with business impact.
  • Promote transparency, visibility, and best practices across teams.

4. Technical Leadership:

  • Guide system design, architecture, and operational best practices for monitoring and observability platforms.
  • Advocate for automation, observability, and reliability at scale.

5. Continuous Improvement and Innovation:

  • Introduce new monitoring, observability, and automation tools.
  • Encourage knowledge sharing, learning, and innovation across teams.

What Will Make You Successful

Professional Skills

  • Strong leadership, communication, and mentoring skills.
  • Passion for troubleshooting and operational excellence.
  • Hands-on experience with monitoring, metrics, logging, tracing, and alerting.
  • Familiarity with Agile, DevOps, and SRE practices.
  • Fluency in English.

Key Qualifications

  • 8 years in software engineering or operations for large-scale distributed systems.
  • 5 years managing technical teams, preferably in SRE, platform, or production engineering.
  • Expertise in monitoring platforms and observability tools (ELK, Grafana, OpenTelemetry, Splunk).
  • Strong automation skills: infrastructure as code, CI/CD for ops, and scripting (Python, Go, Bash).
  • Production troubleshooting experience across the software stack, networks, and infrastructure.
  • Large-scale Linux, Kubernetes, or cloud-native operations experience.
  • Proven ability to manage mission-critical services and drive reliability culture.

Additional Requirements

  • Advocate for automation-first approaches to minimize operational toil.
  • Strong sense of ownership and transparent communication style.
  • Self-motivated, curious, and proactive in improving systems and processes.

About the Team

Our SRE team tackles high-scale, high-impact challenges in monitoring, observability, and reliability. We value troubleshooting, automation-first thinking, and operational excellence. Collaboration, learning, and innovation are core to our culture.

What we offer

We give you a competitive package

We help you perform at your best

We help you make a difference

We give you the freedom to be yourself

We give you the freedom to be yourself. We are creating an environment of unique individuals like you, with different perspectives on the financial industry and the world: a diverse and inclusive environment in which everyone's voice counts and where you can reach your full potential.

We are committed to an inclusive and accessible recruitment process. If you require a reasonable accommodation related to accessibility during your application or interview, please contact or indicate this in your application.

Please note that this mailbox is not monitored for general recruitment enquiries and should only be used for accessibility or accommodation-related requests (for example, related to vision, hearing, or neurodiversity).

All requests are confidential and will not affect your candidacy.

Don't meet every single requirement? At Swift, we are dedicated to building a workplace where people can bring their full selves and ideas to the team, so if you are excited about this role, we encourage you to apply even if you do not meet every single qualification.


Required Experience:

Manager

Key Skills

  • Lean Manufacturing
  • Six Sigma
  • Food Industry
  • Root Cause Analysis
  • SAP
  • CMMS
  • Conflict Management
  • Maintenance Management
  • Maintenance
  • Supplier Management
  • Team Management
  • Programmable Logic Controllers

About Company

SWIFT is a global member-owned cooperative and the world’s leading provider of secure financial messaging services. We provide our community with a platform for messaging and standards for communicating, and we offer products and services to facilitate access and integration, identifi ...
