Manager, Incident Response and Service Reliability

Job Location

New York City, NY - USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

The Product Operations team empowers Apple teams to execute at scale. We tackle complex organizational, technical, and operational challenges to ensure seamless execution and strategic alignment across Apple. As the manager for the Incident Response and Service Reliability Team, you will lead the team responsible for Apple Wallet's real-time incident response program. You will define and operate the processes for detecting, triaging, prioritizing, and mitigating service-impacting incidents. You will drive the proactive identification of recurring issues, lead root cause analysis, and partner with engineering to implement long-term fixes that reduce risk and improve reliability. Through close collaboration with engineering, infrastructure, SRE, and product teams, you will ensure that incidents are handled with urgency, communication is clear, and issues are addressed at the root.


  • Bachelor's degree or equivalent practical experience.
  • 8 years of experience in incident management, technical program management, or SRE/infra leadership roles.
  • Demonstrated experience building or scaling an incident management program in a production or customer-facing environment.
  • Proven ability to define, measure, and influence operational metrics (MTTD, MTTR, etc.).
  • Strong cross-functional collaboration skills, particularly with engineering, product, and executive stakeholders.
  • Excellent communication skills under pressure, with the ability to drive clarity and urgency.
  • Experience with incident tooling (e.g., PagerDuty, Opsgenie, Slack bots, observability platforms).


  • Experience working in payments, banking, or other financial services companies in a developer role (SRE, DevOps, or other engineering experience).
  • Experience leading incident programs across global teams or regulated environments.
  • Background in high-availability systems, payments infrastructure, or customer-critical services.
  • Familiarity with root cause analysis frameworks, postmortem facilitation, and chaos testing.
  • Experience integrating incident workflows with observability and BI platforms (e.g., Datadog, Grafana, Tableau).
  • Experience driving change in cross-functional or matrixed organizations.

Required Experience:

Manager

Employment Type

Full-Time
