Staff Resilience Engineer (HFN)

Believe


Job Location:

Paris - France

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1 Vacancy

Job Summary

Platform Engineering & Operational Intelligence

Context & Mission

Believe's Platform Engineering organization is building a resilient, observable, and scalable cloud platform serving engineering teams across the company.

Within Operational Intelligence (OI), our mission is to improve performance, safety, productivity, and operational maturity while preserving team autonomy through a Platform-as-a-Product approach.

We are looking for a Staff Resilience Engineer to elevate the platform's reliability posture and proactively strengthen its behavior under failure. This role operates at the intersection of resilience engineering, incident leadership, observability strategy, and distributed systems architecture.

You will not only respond to critical incidents; you will design systems, practices, and experiments that ensure failures are anticipated, controlled, and continuously learned from.

 

What You Will Own

1. Incident Leadership & Systemic Improvement

  • Own the organization's response to high-severity incidents.

  • Establish a clear, scalable incident management model across teams.

  • Turn incidents into structural platform improvements.

  • Improve reliability KPIs (MTTR, detection latency, recurrence).

2. Resilience & Failure Engineering

  • Identify systemic architectural risks and scalability limits.

  • Design and institutionalize proactive resilience practices.

  • Lead and scale chaos engineering and controlled failure experimentation.

  • Ensure systems behave predictably under stress and partial failure.

  • Drive the evolution of self-healing and failover capabilities.

3. Observability & Reliability Strategy

  • Define the platform-wide observability vision and standards.

  • Improve signal quality, detection speed, and SLO maturity.

  • Align telemetry architecture with performance and cost efficiency goals.

  • Standardize instrumentation and reliability practices across squads.

4. Platform Reliability Evolution

  • Influence the long-term reliability posture of the platform.

  • Embed operational excellence in platform capabilities.

  • Partner with engineering and product leadership on roadmap priorities.

  • Raise the reliability bar across the organization.

 

Staff-Level Expectations

At Staff level, you are expected to:

  • Influence architecture and operational strategy across multiple engineering groups.

  • Lead large-scale, multi-quarter initiatives with high autonomy.

  • Identify cross-team risks before they materialize.

  • Shape engineering culture around resilience and operational maturity.

  • Mentor engineers and elevate platform squads technically.

  • Drive alignment among diverse stakeholders.

  • Balance strategic thinking with hands-on technical execution.


Qualifications:

Experience

  • 5 years in Platform Engineering, Resilience Engineering, Incident Management, or similar.

  • Proven ownership of distributed systems in production.

  • Demonstrated leadership during high-impact incidents.

  • Experience influencing senior engineering stakeholders.

  • Cloud-native background (GCP preferred; AWS/Azure acceptable).

Technical Skills

  • Deep expertise in observability:

    • OpenTelemetry (SDKs, semantic conventions, Collector pipelines)

    • Datadog, ELK, or equivalent

  • Strong cloud-native fundamentals:

    • Kubernetes, containers, service meshes

    • Infrastructure-as-Code (Terraform, Crossplane)

    • CI/CD automation (GitLab CI or similar)

  • Solid knowledge of distributed systems, Linux, networking, DNS, and load balancing.

  • Strong scripting/automation skills (Python, Go, Rust, Bash).

  • Experience with auto-remediation and failure detection systems.

  • Understanding of security best practices and compliance.

Soft Skills

  • Ability to simplify complexity and drive alignment.

  • Calm, structured decision-making under pressure.

  • Strong mentoring and technical leadership mindset.


Additional Information:

SET THE TONE WITH US:

Working at Believe means having individual and collective impact in a fast-growing company!  

At all stages of their careers, Believers are an important part of what we are doing: shaping the future of the music industry.

We need teams that truly reflect the diversity of our clients. Our international presence creates an inspiring and enriching work environment for each of us, with daily opportunities to connect with our colleagues all over the world.

We have two hearts at Believe - our People and our Artists. 

We believe in THE POWER OF OUR PEOPLE, who grow every day to develop their potential. We aim to provide our Believers with the best environment to thrive.

 

ROCK THE JOB 

  • Tailor-made training and coaching program 

  • Remote working policy

  • A wellness program, "Pauses", with many in-house activities and events

  • Access to Eutelmed, a digital mental health and well-being platform that allows you to speak with an experienced psychologist

  • A healthy and eco-responsible company restaurant

  • Individual or family health insurance

  • CSE benefits 

  • A rooftop

  • A gym with free classes

SING IN HARMONY 

  • Ambassador program: an employee volunteering initiative dedicated to all Believers interested in having a positive impact on Diversity, Equity & Inclusion (DEI), wellbeing, and the planet.

  • Sustainable mobility package (Forfait mobilité durable): reimbursement of up to 600 for public transport or low-carbon-footprint travel

  • 5 calendar days of second-parent leave with 100% pay (in addition to the legal paternity or adoption leave)

We are committed to having a workforce that is representative of the community it serves at all levels of the organisation. We therefore welcome applications from all backgrounds and all sections of the community, regardless of age, disability, gender, race, religion, and sexual orientation.


Remote Work:

No


Employment Type:

Full-time


Key Skills

  • Computer Science
  • Docker
  • Kubernetes
  • Python
  • VMware
  • C/C++
  • Go
  • System Architecture
  • gRPC
  • OS Kernels
  • Perl
  • Distributed Systems

About Company


Believe is one of the world’s leading digital music companies. Believe’s mission is to develop independent artists and labels in the digital world by providing them the solutions they need to grow their audience at each stage of their career and development. Believe’s passionate team ...
