Site Reliability Engineering (SRE) - Apple Data Platform

Employer Active

1 Vacancy
Job Location

Austin - USA

Monthly Salary

Not Disclosed

Job Description

Apple Services infrastructure is planetary scale. Our Data Platform Site Reliability Engineering team manages the infrastructure and applications on bare-metal and cloud computing platforms to deliver data processing, governance, and storage for many of Apple's global products and organizations. Our platform teams work with exabytes of data, terabytes of memory, and hundreds of thousands of jobs running millions of executors to support predictable and performant data analytics. Our platform enables key features in Apple Music, TV, Maps, News, and other world-class products.

Ensuring that all of these technologies work together in harmony across geographically distributed data centers presents unique challenges. As an SRE at Apple, you'll need to solve the problems that arise using empirical data, teamwork, and your own unique [...]. Platform Services SREs work directly with our partner engineering teams, collaborating tightly with the software developers to deliver seamless experiences for our customers. We run a mix of open source, vendor-licensed, and proprietary tools, which you will use and have opportunities to improve upon.

The cross-functional team collaborates to ensure we apply a consistent incident management process across all data platform services and provide user-journey-based SLOs derived from exhaustive observability metrics, high-availability architecture, and automation for deployments. We think critically and strive to balance long-term optimal solutions with the business priorities for each engineering challenge we face. Good ideas are heard and results are rewarded.


  • BS/MS in Computer Science or equivalent
  • 5 years of software development or production operations experience in a large-scale environment
  • Proficiency in authoring and releasing code in Go, Python, or Java using common configuration management and software delivery platforms
  • Experience operating production applications at scale, including well-designed performance testing, HA and disaster recovery concepts, capacity planning, and managing distributed systems on internal and public cloud infrastructure, principally Kubernetes
  • Understanding of the Linux operating system, containers and virtualization, and standard networking protocols and components
  • Strong sense of ownership and integrity, demonstrated through clear communication and collaboration
  • Excellent troubleshooting and problem-solving skills using the scientific method


  • Proficiency with the architecture, deployment, performance tuning, and troubleshooting of open source data analytics and data governance technologies, especially Apache Spark, Flink, Hive, Hadoop/HDFS, or other related software
  • The successful candidate is frustrated with toil and has an acute drive to both automate manual operations and evolve them into automatic processes

Employment Type

Full Time
