- Improve tools and advocate for operational excellence in continuous monitoring, self-healing systems, and alert transparency
- Own and guide critical incident mitigations and drive blameless postmortems
- Troubleshoot production incidents and resolve them in a timely manner
- Define key SLIs and help shape business- and quality-focused SLOs with PMs
Qualifications :
- Proficiency in Java or another JVM language. Solid knowledge of PHP, Golang, or Python is a plus.
- Can-do attitude within high-performing distributed teams.
- Good time management, strong communication, and collaboration skills in remote environments.
- Solid, production-proven SRE expertise.
- Understanding of Cloud and Distributed Systems challenges.
- Ability to leverage telemetry tools to ensure quality and operational excellence.
- An intrinsic drive to drill deep into complex tasks and incidents, paired with a natural need to share what you learn with your peers.
Additional Information :
- Possibility to work remotely either partially or on a full-time basis
- International team with flat hierarchies and open communication
- Educational budget to support your continuous learning and development
- Regular knowledge-sharing sessions within the team on various topics
- Exciting team events every quarter
Contact
Marija Dimitrova
At AUTO1 Group, we live an open culture, believe in direct communication, and value diversity. We welcome every applicant, regardless of gender, ethnic origin, religion, age, sexual identity, disability, or any other non-merit factor.
Remote Work :
Yes
Employment Type :
Full-time