Compute Site Reliability Engineer (SRE) Kubernetes

Apple

Job Location:

Seattle, OR - USA

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1 Vacancy

Job Summary

As a Compute Site Reliability Engineer, you will be responsible for maintaining, monitoring, and improving the reliability, scalability, and performance of our Kubernetes-based infrastructure. You'll work closely with senior SREs, developers, and other engineers to ensure high availability and optimize our containerized applications. This is a fantastic opportunity for someone eager to grow their expertise in Kubernetes and cloud-native technologies.

As an SRE at Apple, you will:

  • Operate, monitor, and triage all aspects of our production and non-production environments.
  • Design, build, and implement innovative solutions for past, present, and future issues.
  • Prepare alert handling procedures and runbooks, and collaborate with other SRE teams.
  • Participate in on-call rotations to troubleshoot and resolve production issues, minimizing downtime.
  • Automate deployment and orchestration of services into the cloud environment, as well as other routine processes.
  • Actively participate in capacity planning, scale testing, and disaster recovery exercises.


  • Bachelor's degree in Computer Science, an engineering-related field, or equivalent related experience.
  • Basic understanding of Kubernetes architecture, including Pods, Deployments, Services, and ConfigMaps.
  • Familiarity with Linux systems administration and command-line tools.
  • Experience with scripting languages like Bash, Python, or Go.
  • Knowledge of monitoring tools such as Prometheus, Grafana, or similar.
  • Exposure to CI/CD pipelines and DevOps practices.
  • Awareness of containerization.
  • Strong problem-solving skills and a willingness to learn new technologies.
  • Outstanding organizational and communication skills.


  • Strong verbal and written communication skills.
  • Automation advocate: you truly believe in removing operational load via software.
  • Familiarity with Infrastructure as Code (IaC) tools like Puppet.
  • A strong sense of ownership. At the same time, you're a great teammate who communicates clearly and transparently.
  • Self-motivated, inquisitive, and always looking to learn more.
  • Experience managing, scaling, and troubleshooting Java and Go applications.
  • CNCF Kubernetes Administration certification.

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

Ask Siri to name the most successful company in the world and it might respond: Apple. And it's not just out of familial pride. Apple consistently ranks highly in profit, revenue, market capitalization, and consumer cachet. In 2018, the company became the first to reach a trillion dollar ...