Apple Services' scale is BIG. Operating at our scale across multiple geographies and serving hundreds of millions of users presents unique challenges. As a Software Developer in SRE at Apple, you'll need to solve these problems using data, teamwork, and your own expertise. ASE Products Site Reliability teams are responsible for the reliability and performance of the server software stack that powers products like iCloud Photos, Mail, Drive, Backup, and many more. We do that by focusing on reliability best practices from service inception to production, collaborating deeply with product development teams to deliver a superlative product and shared vision, while leveraging data and automation as first principles. We run a mix of open source, vendor-licensed, and internally developed tools to manage the end-to-end SDLC of our products. You'll learn these tools and have opportunities to improve them. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.
- 5 years of software development or production operations experience in a large-scale environment
- BS or MS in Computer Science or related field
- Experience in managing and scaling large distributed systems in a public, private, or hybrid cloud environment
- An inherent bias for action, a strong sense of ownership, and integrity, demonstrated through clear communication and collaboration
- Experience with deploying and supporting new and existing services, platforms, and application stacks
- Familiarity with cloud infrastructure concepts (zones, regions, VPCs, etc.)
- Excellent troubleshooting and problem-solving skills
- Skills and experience in monitoring, alerting, fault analysis, and automation
- The ability to design, author, and release code in languages like Java, Go, or Python
- Ability to participate in on-call service support
- Lead incident response and root cause analysis of production systems
- Fast learner who is generous with their knowledge
- Experience with disaster recovery, capacity planning, and chaos testing
- Being curious about how systems work and, more importantly, how they fail
- An acute drive to build bots that automate away repetitive tasks
- Working knowledge of microservices architecture and container orchestration with Kubernetes or similar technologies, preferably in a large-scale production environment
- Experience with managing large numbers of diverse systems with configuration management and software delivery platforms (such as Spinnaker, Terraform, Puppet, Chef, or Ansible) in a public, private, or hybrid cloud environment
- Experience with Linux/Unix, networking, systems management, and systems security
- Experience using modern object storage systems like S3 or GCS
- Familiarity with large-scale observability systems like Prometheus, Grafana, and Splunk
- A track record of partnering with peers to foster solid engineering principles
- Strong belief in acquiring and spreading knowledge via mentorship
Required Experience:
Senior IC