What are we looking for?
As a Staff Site Reliability Engineer, you will be a technical leader within the SRE organization, responsible for setting the technical direction and driving the long-term reliability vision for SentinelOne's production service. You will be empowered to solve systemic, cross-team challenges and improve the reliability, scalability, and performance of our entire service ecosystem. You will not just contribute to major initiatives like our Monitoring and Observability Uplift and Logging Pipeline modernization; you will be instrumental in leading the strategy and architecture for these large-scale projects, ensuring they meet the long-term needs of the business.
The SRE organization's mission at SentinelOne (S1) is to keep our uptime promise to our customers by ensuring we meet our SLOs/SLAs, help our engineering teams ship software to our customers fast and with quality, and ensure our customers are successful. You will join the Core SRE team at S1 and have an amazing opportunity to drive outcomes that improve the reliability, stability, and cost efficiency of S1's Singularity Platform, our largest customer-facing service, which has over 15,000 B2B/B2G customers deployed across more than 6 regions and 2 cloud service providers.
What will you do?
As a Staff SRE, you will be a key technical leader, strategist, and mentor. You will operate across teams to solve the most challenging reliability and scalability problems at SentinelOne.
Your responsibilities will include:
- Setting the technical direction for reliability across multiple services, partnering with engineering leaders to create and execute long-term roadmaps.
- Identifying and eliminating entire classes of operational work by designing and building scalable, automated platforms for use by all of SRE and Engineering.
- Leading post-mortems for major multi-system incidents and owning the strategic follow-up to address systemic root causes across the organization.
- Mentoring and developing senior engineers within the SRE organization, acting as a force multiplier to level up the entire team.
- You will join a like-minded team of SREs who help run our operations smoothly at scale by building a platform on which S1's services can run. If the thought of running a large-scale cybersecurity platform on various cloud providers and in air-gapped environments excites you, you've found the right place!
- As a team, we value good written communication skills, data-driven decisions, and a keen eye for continuous improvement. You'll help simplify, have a passion for new ideas, and know how to execute iteratively towards the final goal. We value candor and collaboration.
What skills and knowledge should you bring?
- An extensive and proven track record in SRE/DevOps, with deep experience leading large, cross-functional technical projects from inception to completion.
- Expertise across multiple cloud providers (AWS, GCP, Azure), with proven experience designing, running, and troubleshooting highly available systems in complex multi-cloud and air-gapped environments.
- Strong proficiency in one or more mainstream languages (e.g., Go, Python), with demonstrable experience building scalable software and automation platforms.
- Strong production experience with orchestration systems like Kubernetes, Nomad, or Mesos (we are a Kubernetes shop).
- Proven ability to set technical direction and influence the roadmap of multiple engineering teams without direct authority.
- Experience with SecOps and compliance processes and their touch points with SRE is desired.
- Polyglot experience with other SRE tools; we integrate with more tools every day.
Apart from the above technical skills, the following soft skills are required:
- A strong sense of business acumen and the ability to evaluate technical decisions in the context of cost, risk, and long-term company strategy.
- Demonstrated experience in mentoring and growing senior engineers.
- Exceptional communication skills, with the ability to articulate complex technical concepts to diverse audiences, from junior engineers to executive leadership.
- Curiosity, fast learning, and a drive for continuous improvement.
- Ability to work in a diverse and distributed team.
- A self-starter who is passionate about and motivated by new technologies and has empathy for legacy systems.
- A quick learner who can navigate unfamiliar programming languages, systems, and processes.
Why Us
Join a cutting-edge company tackling extraordinary challenges alongside top industry talent. Enjoy flexible hybrid work in Prague (Karlin), Brno (Clubco), or remotely across CZ/SK. Only Prague-based employees are required to work from the office at least two days per week.
Competitive Benefits Package:
- Stock & Bonuses: Grant of Restricted Stock Units with a 4-year vesting plan, annual performance-based bonuses, and an employee stock purchase plan.
- Time Off & Well-being: Flexible Time Off on top of the standard 5 weeks of vacation, flexible paid sick days, fully paid Short Term Sick/Nursing Leave, 16-week parental leave, grandparent leave, and additional company holidays.
- Insurance & Health: Pension insurance contribution, premium life insurance, private medical care (for you and 1), and a Global Employee Assistance Program.
- Work Perks: Monthly meal and well-being allowance, high-end MacBook/Windows laptop, work-from-home support, and in-office refreshments.
- Growth & Community: LinkedIn Learning, internal mentoring, educational support, generous referral bonuses, and optional company events (sports, BBQs, charity).
Be part of an inclusive, innovative workplace that values belonging, flexibility, and growth!
Required Experience:
Staff IC