Staff Site Reliability Engineer

OVO Energy

Job Location:

London - UK

Monthly Salary: Not Disclosed
Posted on: 16 hours ago
Vacancies: 1 Vacancy

Job Summary

Role OVO-View

Location: Hub Based - Hybrid

Salary banding: £64,070 - £84,569

Experience: Mid-level/Expert

Working pattern: Full-Time

Reporting to: Principal Cloud Platform Engineer

Sponsorship: Unfortunately we are unable to offer sponsorship for this role.

This role in 3 words: Automation, Resilience, Observability

Top 3 qualities for this role: Analytical, Proactive, Collaborative

Where you'll work:

Depending on the needs of your business area, we expect hub-based people to be in the office at least once a week and to go to OVO Connection events in person.

You'll be assigned to the closest of our three hub offices (Bristol, Glasgow or London), unless your role requires field-based work. Each hub has accessible spaces to park your laptop and is designed to inspire people, help them connect and bring big ideas to life.

Everyone belongs at OVO:

At OVO, we are on a mission to solve one of humanity's biggest challenges: the climate crisis. And we know it takes all of us to change the world. That's why we need diverse people from all abilities, gender identities, ethnicities, ages, sexual orientations, life experiences and backgrounds to join us.

Teamworking for the planet:

Everything we do here spins around Plan Zero. So, naturally, the team you'll be joining plays a gigantic role in making that happen. Here's how:

Site Reliability Engineering is at the heart of OVO's customer-focused technology transformation, building and maintaining scalable, efficient and reliable platforms for OVO's applications and services. The goal of Site Reliability Engineering is to enhance the reliability, performance and cost-efficiency of OVO's systems, enabling teams to confidently deliver robust services in GCP. This focus on smart and efficient usage of cloud services also contributes to reducing CO2 usage, which is at the heart of OVO's Plan Zero.

This role in a nutshell:

As a Site Reliability Engineer at OVO, you'll help ensure our systems are reliable, scalable and efficient. You'll focus on maintaining high service availability, improving performance and optimising how we monitor and respond to incidents. Your expertise in reliability engineering will support continuous improvement, help resolve issues proactively before they impact users, and strengthen the overall resilience of our infrastructure.

Your key outcomes will be:

  • Developing, Refining and Automating Monitoring Systems: Design, manage and enhance monitoring, alerting and observability systems - such as Datadog, Prometheus and Grafana - ensuring they deliver meaningful insights and effective alerting. You'll also automate repetitive monitoring tasks to improve efficiency.

  • Managing SLOs/SLIs and Improving Incident Response: Define and track SLOs and SLIs for key services, contributing to better reliability insights. You'll also help refine incident response processes, support on-call operations, and improve tooling and communication during incidents.

  • Incident Management and Post-Mortem Analysis: Play a key role in resolving complex production incidents, leading or supporting technical response efforts. Following incidents, you'll conduct blameless post-mortems to uncover root causes and drive lasting improvements.

  • Cost Optimisation Implementation: Assess infrastructure usage and apply approved strategies to optimise cloud costs - balancing resource efficiency with performance and reliability.

  • Capacity Planning, Performance Tuning & Resilience: Using monitoring and load testing data, you'll support capacity planning, recommend performance improvements and help implement resilience best practices across systems.

  • Collaboration and Knowledge Sharing: Work closely with engineering, QA, security and product teams to embed reliability practices, document key processes and mentor peers to support collective learning and growth.

  • Design Review Input: Take part in design reviews, offering guidance on how to improve reliability, scalability and day-to-day operability within system architecture.

  • Community of Practice: Actively contribute to your Community of Practice - leading discussions, sharing experiences, mentoring others and helping shape content and capability growth within your area of expertise.

You'll be successful in this role if you:

  • Have a Software Engineering Background: You have professional experience in programming languages such as Python, TypeScript, Go or Java, and you apply software best practices (CI/CD, unit testing, code reviews) to infrastructure.

  • Are Experienced with the Cloud: You have hands-on experience navigating the complexities of public cloud ecosystems (AWS, GCP or Azure) and understand the nuances of cloud-native networking and storage. You can demonstrate an understanding of how distributed systems may fail and how to design for fault tolerance.

  • Are an Infrastructure as Code (IaC) Expert: You have advanced experience with Terraform, Pulumi or Crossplane to manage at-scale infrastructure.

  • Have a Data-Driven Mindset: You use metrics and logs to drive engineering decisions. You understand the foundations of SLOs and error budgets.

  • Are a Problem Solver: You enjoy complex debugging. You can dive into the Linux kernel or network stack to find the root cause of a performance bottleneck.

  • Are a Mentor & Advocate: You are passionate about teaching "The SRE Way" to engineers, helping them take ownership of their services' reliability.

  • Have an Efficiency and Cost Engineering Mindset: You treat capacity planning, performance tuning and cost optimisation as software engineering challenges rather than administrative tasks. You naturally lean toward building efficiency-as-code.

Let's talk about what's in it for you:

We'll pay you between £64,070 and £84,569, depending on your specific skills and experience.

We keep our pay ranges broad on purpose to give us, and you, flexibility to match your experience to our zero carbon mission.

You'll be eligible for an on-target bonus of 15%. We have one OVO bonus plan that focuses on the collective performance of our people to deliver our Plan Zero goal.

We also offer plenty of green benefits and progressive policies to help you feel like you belong at OVO, and there's Flex Pay. We'll give you 9% Flex Pay on top of your salary: 4% of this is auto-enrolled into your pension and the remaining 5% is yours to do what you like with. You can use this to buy from our extensive range of flexible benefits (including our green benefits, which we've put at the heart of our offering), add to your pension or even take it as cash.

Here's a taster of what's on offer:

For starters, you'll get 34 days of holiday (including bank holidays).

For your health
With benefits like a healthcare cash plan or private medical insurance (depending on your career level), critical illness cover, life assurance, health assessments and more.

For your wellbeing
With gym membership, travel insurance, workplace ISA, will writing services, dental insurance and more.

For your lifestyle
With extra holiday buying, discount dining, home & tech loans, and supporting your favourite charities with give-as-you-earn donations.

For your home
Get up to £400 towards any OVO Energy plan, plus great discounts on solar, smart thermostats and EV chargers.

For your commute
Nab a great deal on ultra-low emission car leasing, plus our cycle to work scheme and public transport season ticket loans.

Want to hear about our full range of flexible benefits and progressive people policies? Our People Team can tell you everything you need to know.

For your Belonging

To find better ways to support our people, we need to listen to each other's experiences and find ways to build a truly inclusive and diverse workplace. As part of this, we have 8 Belonging Networks at OVO. They're led by our people, for our people, so when you join OVO you can play a part - big or small - with any of the Networks. It's up to you.

Oh, and one last thing...

We'd be thrilled if you tick off all our boxes, yet we also believe it's just as important we tick off all of yours. And if you think you have most of what we're looking for but not every single thing, go ahead and hit apply. We'd still love to hear from you!

If you have any additional requirements, there's a space to let us know on the application form; we want to make the process as easy and comfortable for you as possible.


Required Experience:

Staff IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting