Manager Site Reliability Engineering

Sabre

Job Location:

Bengaluru - India

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Powering the agentic revolution in travel. Sabre is an AI-native technology leader backed by one of the world's largest travel data clouds. Built on an open, modular, cloud-native architecture, Sabre serves as the backbone for both established leaders and bold new disruptors, guiding them to the next age of travel retailing through intelligent, connected, and personalized experiences. With AI at its core and operating at unparalleled scale, Sabre transforms insights into innovation, empowering airlines, hoteliers, agencies, and other partners to retail, distribute, and fulfill travel worldwide.

This role requires a strong blend of people leadership, stakeholder management, technical depth, and communication excellence to deliver reliable platforms and measurable business outcomes.

Team Description

The Connectivity SRE team is responsible for the reliability, availability, performance, and cost efficiency of mission-critical connectivity platforms operating across hybrid and cloud environments (GCP: GKE/GCE). The team partners closely with Engineering, Product, Network/Infrastructure, Security, Capacity, and external vendors to ensure resilient services that support Sabre's core business.

Role Summary

As a Site Reliability Engineering Manager, you will lead a globally distributed SRE team responsible for the reliability and operational excellence of mission-critical connectivity platforms and applications. You will balance people leadership with operational ownership, technical oversight, and cross-regional collaboration.

This is a hands-on leadership role focused on reliability engineering and SRE maturity, owning on-call strategy, incident leadership, SLO/error budgets, disaster recovery readiness, observability, toil reduction, security compliance, and cost optimization while driving cross-functional execution and continuous improvement.

Key Responsibilities

Own production reliability for connectivity services, including SLO and error-budget management, proactive production health monitoring, and continuous improvement.
Lead 24x7 on-call operations and major incident response, including rotation design, escalation paths, incident leadership, and blameless post-incident reviews.
Own operational execution and work intake, including prioritization, assignment, and tracking of work items (e.g., Jira/Rally), to ensure timely and reliable delivery.
Ensure systems are secure, compliant, and resilient, including OS/platform patching, vulnerability remediation, configuration compliance, and PCI audit readiness, in partnership with Security and Compliance teams.
Maintain disaster recovery readiness, including RTO/RPO posture, testing cadence, and remediation of identified DR gaps.
Drive SRE best practices, including observability (metrics, logs, traces), alert hygiene, automation, toil reduction, and standardized runbooks and readiness reviews.
Own production change governance, including review and approval of changes (e.g., ServiceNow), ensuring appropriate risk assessment, rollback plans, and cross-team coordination to prevent production impact.
Collaborate with engineering teams to embed reliability by design into architectures, releases, and change management practices.
Lead, coach, and develop a globally distributed SRE team, establishing clear ownership models, supporting career growth, and fostering a culture of accountability and continuous learning.
Act as the primary SRE partner for Engineering, Product, Network/Infrastructure, Security, Capacity, and key vendors, driving cross-functional initiatives such as modernization efforts, DR drills, observability improvements, and cost/capacity optimization.

Qualifications

Required

8-10 years of experience in SRE, DevOps, or Infrastructure Engineering roles.
3 years of experience as a people manager leading engineers or SRE teams.
Proven experience running 24x7 large-scale production systems with strong incident management and on-call leadership.
Hands-on experience with GCP (or other major cloud), Kubernetes/GKE, Linux, and networking fundamentals.
Strong depth in monitoring and observability (e.g., Grafana, Splunk, AppDynamics, or equivalents) and reliability governance (SLOs, error budgets).
Strong stakeholder management skills with the ability to communicate clearly with senior engineering and business partners.
Bachelor's degree in Computer Science, Engineering, or equivalent experience.

Preferred

Experience leading cloud migrations and platform modernization initiatives.
Demonstrated outcomes in cost optimization and capacity planning.
Familiarity with CI/CD pipelines and change-risk controls in high-availability or regulated environments.
Experience supporting security compliance and audit requirements (e.g., PCI).
Experience leading or collaborating with globally distributed teams across multiple time zones.

We will give careful consideration to your application and review your details against the position criteria. You will receive separate notification as your application progresses.

Please note that only candidates who meet the minimum criteria for the role will proceed in the selection process.

#LI-Hybrid #LI-BG1

Required Experience:

Manager

About Company

Sabre

Sabre Corporation is a travel technology company based in Southlake, Texas. It is the largest Global Distribution Systems provider for air bookings in North America. American Airlines founded the company in 1960, and it was spun off in 2000.
