Site Reliability Engineer, Observability, London

London - UK

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Apple Services Engineering infrastructure is BIG. Operating at our scale across multiple geographically dispersed data centers and servicing hundreds of millions of users presents unique challenges. As an SRE at Apple, you'll need to solve these problems using data, teamwork, and your own expertise. SREs at Apple own the full infrastructure stack; from device driver performance debugging to content delivery network traffic management, our responsibilities are both broad and deep.

Apple runs the majority of its systems on Linux. We run a mix of open source, vendor licensed, and internally developed tools to perform functions such as system configuration management, provisioning, software deployment, logging, and monitoring. You'll learn these tools and have opportunities to improve them.

Our team is collaborative; we work closely with the development teams we support to deliver the best results for Apple. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.

We are looking for passionate and talented Site Reliability Engineers to continue our focus on providing our customers the highest quality Apple Services experience. Our services have to scale globally, stay highly available, and just work. If you love designing, engineering, and running systems and infrastructure that will help millions of customers, then this is the place for you!

The Observability SRE organization is specifically tasked with enabling other teams to better understand their infrastructure and services, providing world-class observability. Keeping Apple services up and running 100% of the time is a challenging job. Accurately monitoring the health of every application and infrastructure that comprises the Apple ecosystem 100% of the time is an order of magnitude more challenging.

Proven experience developing production-grade software in Python, Go, or Java, and a strong understanding of the Linux operating system and the TCP/IP suite of networking protocols
Strong sense of ownership and integrity demonstrated through clear communication and collaboration
Experience and confidence around incident response and incident management
Experience/knowledge in managing and scaling distributed systems in a public, private, or hybrid cloud environment

Bare metal management experience and experience with deploying, supporting, and monitoring new and existing services, platforms, and application stacks.
Demonstrated ability to investigate complex systemic and latent reliability issues and collaborate cross-functionally with software and systems teams to implement sustainable solutions.
Experience automating workflows and reducing operational toil through scalable solutions.
Monitoring of systems and services, and optimization of performance and resource utilization.

Key Skills

Kubernetes
FMEA
Continuous Improvement
Elasticsearch
Go
Root cause Analysis
Maximo
CMMS
Maintenance
Mechanical Engineering
Manufacturing
Troubleshooting

About Company

Apple

Ask Siri to name the most successful company in the world and it might respond: Apple. And it's not just out of familial pride. Apple consistently ranks highly in profit, revenue, market capitalization, and consumer cachet. In 2018, the company became the first to reach a trillion dollar ...
