We work on Apple-scale opportunities and challenges. We are engineers at heart, and we like solving technical problems. We believe that a good SRE must be a good software engineer who can code anything that has logic and a pattern to it. We believe a good engineer has the curiosity to dig into the inner workings of technology and is always experimenting, reading, and learning. If you are a software engineer with a passion to code and dig deeper into any technology, who loves knowing the internals and is fascinated by distributed systems architecture, we want to hear from you.
The person should be able to deftly handle multiple competing priorities at once and deliver solutions in a timely manner. The person will participate in a 12x7 on-call rotation and resolve production incidents in a timely manner. The person should be able to understand complex architectures and be comfortable working with different teams.
Experience: 3 years in software site reliability engineering or software development roles.
Programming: Proficient in at least one of Python, Golang, or Java.
Data Structures & Algorithms: Strong foundation and application experience.
Distributed Systems: Solid understanding and hands-on experience managing at least one distributed system (e.g., Kafka, Cassandra, Hadoop, Redis, or similar).
Kubernetes: Expertise in the Kubernetes ecosystem (deployment, configuration, monitoring, and operation).
Cloud Platforms: Hands-on experience with at least one major cloud platform (AWS, Azure, or Google Cloud Platform).
KEY RESPONSIBILITIES
Design, develop, and automate: Build tools, frameworks, and solutions to improve reliability, scalability, and efficiency across systems.
Monitor and maintain: Implement advanced monitoring and alerting for cloud and containerized workloads.
Troubleshoot and solve: Respond to and resolve complex production incidents and perform root cause analysis.
Collaborate: Work closely with development and operations teams to integrate reliability best practices throughout the software lifecycle.
Optimize: Proactively recommend improvements in architecture, deployment, and operations for distributed systems.
Problem Solving: Demonstrated ability to independently troubleshoot and resolve complex technical issues.
Creative Thinking: A track record of proposing and implementing innovative solutions to technical challenges.
Strong communication and collaboration abilities.
Willingness to learn and adapt to new technologies rapidly.
Ownership mindset and accountability for deliverables.