Senior Site Reliability Engineer / Cloud Operations Engineer (m/f/d)

Thales

Job Location:

Berlin - Germany

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Location: Berlin, Germany

We Say HI*

Site Reliability Engineer / Cloud Operations Engineer (f/m/d)

German companies and public administrations in this country are ready to accelerate their digital transformation and the use of AI, but they will never compromise on the security of their most sensitive data. This is where Thales in Germany, in partnership with Google Cloud and our new company currently being established, comes into play. With a new 100% German business unit, we are providing a concrete response to the strict requirements of the BSI. What we are creating is a locally and fully autonomously operated Trusted Cloud that provides access to the broadest service portfolio on the market while everything remains strictly under European jurisdiction. By combining German and French standards such as SecNumCloud, C5 and C3-A, we offer our customers unequaled resilience and business continuity. This is a turning point for our industry and a decisive step towards a strong, sovereign, digital Europe.

Your mission as Site Reliability Engineer:

Operate and maintain mission-critical sovereign cloud services with availability targets of 99.99% and above.
Monitor service health, reliability, scalability, latency and performance using Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
Investigate, troubleshoot and resolve complex production incidents across large-scale distributed cloud environments.
Participate in a structured 24/7 on-call rotation (approximately one week every six weeks) to ensure continuous service availability.
Collaborate with Site Reliability Engineers, Cloud Infrastructure Specialists and Product Experts across international teams to mitigate incidents and drive long-term solutions.
Build a deep understanding of Google's cloud technologies and distributed systems through an intensive training program covering technologies such as Borg, Colossus, Spanner and other core GCP components.
Drive operational excellence by creating and maintaining technical documentation, standardizing incident response procedures and continuously improving operational playbooks.
Lead and contribute to post-incident reviews, root cause analyses and the implementation of preventive measures to improve platform reliability.
Identify opportunities for automation and contribute to improving operational efficiency, scalability, compliance and service reliability.
Support the operation of highly secure cloud environments designed to meet stringent regulatory and sovereignty requirements.

We are looking forward to:

Several years of experience in Site Reliability Engineering, Cloud Operations, DevOps, Platform Engineering, Infrastructure Engineering, Production Support, Network Operations (NOC), Technical Operations or a comparable role.
Experience operating and supporting business-critical production systems with demanding uptime and availability requirements.
Strong troubleshooting and incident management skills in complex technical environments.
Experience monitoring, operating and maintaining distributed systems, cloud platforms, infrastructure services or large-scale applications.
Familiarity with reliability engineering concepts, observability, monitoring, alerting, incident response and root cause analysis.
Experience working with automation, scripting, operational tooling or Infrastructure-as-Code approaches.
Strong analytical and problem-solving skills with a structured and methodical approach.
Professional proficiency in both German and English.
Willingness to participate in a regular on-call rotation.
Curiosity, adaptability and a strong desire to learn and work with hyperscale cloud technologies.

The Group invests more than 45 billion per year in Research & Development in key areas, particularly for critical environments, such as Artificial Intelligence, cybersecurity, quantum and cloud technologies.

In 2025, the Group generated sales of €22.1 billion.

For our more than 85,000 employees in 65 countries, we open up visionary perspectives, realise individual career paths and enable creative freedom. This is achieved with courage, versatility and the firm intention to make the demanding challenges of our time safer and more inclusive. With our sustainable, value-focused management, we actively support diversity.

Say HI* Your journey to us

At times of change, our international teams are ready to meet the complexity of today with the industry-leading technologies of tomorrow. Will you be part of it? Your Talent Acquisition contact, Andre Fuhrmann, is looking forward to your online application.

Andre Fuhrmann, Talent Acquisition Partner

+49 7156/302-22002

*Human Intelligence

#LI-AF1

#LI-HYBRID

Required Experience:

Senior IC
