Site Reliability Engineer

CGI

Job Location:

London - UK

Monthly Salary: Not Disclosed
Posted on: 23 hours ago
Vacancies: 1 Vacancy

Job Summary

Position Description:

We are seeking an experienced and proactive Site Reliability Engineer (SRE) to join a team supporting multiple data product and platform groups. This role is focused on improving the reliability, scalability, observability, and operational performance of critical data-driven platforms and services across complex production environments.

The successful candidate will work closely with engineering, platform, and support teams to strengthen monitoring and alerting capabilities, improve logging and traceability, troubleshoot production incidents, support deployments, and automate operational processes wherever possible. The environment includes Kubernetes, Helm, and the ELK stack, and has a strong focus on modern Site Reliability Engineering practices across cloud and platform services.

This is a hands-on technical role suited to someone who thrives in fast-paced operational environments and is passionate about reliability engineering, automation, and continuous improvement. The role requires strong collaboration with both client stakeholders and engineering teams to ensure platform stability, operational excellence, and high service availability.

Your future duties and responsibilities:

- Support, maintain, and improve highly available production platforms and services across cloud and containerised environments.
- Manage and support Kubernetes clusters and Helm-based deployments across multiple environments.
- Implement and enhance monitoring, alerting, logging, and observability solutions to improve platform reliability and operational visibility.
- Investigate incidents, analyse logs, identify root causes, and drive timely resolution of production issues.
- Participate in incident response, post-incident reviews, and continuous operational improvement initiatives.
- Automate operational tasks and repetitive support activities to reduce manual effort and improve platform efficiency.
- Work closely with engineering and data platform teams to improve system resilience, scalability, deployment reliability, and operational maturity.
- Develop and maintain operational documentation, support procedures, runbooks, and troubleshooting guides.
- Contribute to reliability engineering practices, including proactive monitoring, service health management, and operational readiness.
- Support deployment activities, release processes, and production change management activities.

Required qualifications to be successful in this role:

- Strong commercial experience in Site Reliability Engineering, Platform Engineering, DevOps, or Production Support environments.
- Strong hands-on experience with Kubernetes and Helm in enterprise or production environments.
- Proven experience supporting mission-critical production platforms and operational support functions.
- Strong hands-on experience with the ELK stack (Elasticsearch, Logstash, Kibana) for logging, monitoring, troubleshooting, and operational analysis.
- Demonstrated capability in log analysis, incident investigation, troubleshooting, and root cause analysis.

- Strong understanding and practical experience with core SRE practices, including:
  - Monitoring and alerting
  - Incident management and response
  - Root cause analysis and post-incident reviews
  - Automation and operational improvement
  - Production support and reliability engineering

- Experience working with data platforms, analytics platforms, or data product teams would be highly advantageous.
- Experience with scripting and automation tools such as Bash, Python, or similar technologies is desirable.
- Exposure to CI/CD pipelines, Infrastructure as Code, and cloud-native environments would be beneficial.
- Strong communication, stakeholder engagement, and collaboration skills.
- Ability to work effectively in fast-paced support environments and manage competing priorities under pressure.

Security Clearance
- Resource must be willing and able to work onsite at the client location five days per week.
- Candidate must already hold current HLC clearance (mandatory requirement).
- Previous experience working within secure government, defence, or highly regulated environments will be highly regarded.
- Due to client security requirements, only candidates meeting the required clearance criteria will be considered.

Skills:

  • Amazon Elastic Compute Cloud (EC2)
  • Elastic Stack & Elasticsearch
  • Helm
  • Linux
  • Bash
  • Kubernetes
  • Python
  • Windows

What you can expect from us:

Together, as owners, let's turn meaningful insights into action.

Life at CGI is rooted in ownership, teamwork, respect, and belonging. Here, you'll reach your full potential because:

You are invited to be an owner from day 1 as we work together to bring our Dream to life. That's why we call ourselves CGI Partners rather than employees. We benefit from our collective success and actively shape our company's strategy and direction.

Your work creates value. You'll develop innovative solutions and build relationships with teammates and clients while accessing global capabilities to scale your ideas, embrace new opportunities, and benefit from expansive industry and technology expertise.

You'll shape your career by joining a company built to grow and last. You'll be supported by leaders who care about your health and well-being and provide you with opportunities to deepen your skills and broaden your horizons.

Come join our team, one of the largest IT and business consulting services firms in the world.


Required Experience:

IC

About Company

The COMPANY is one of the few end-to-end consulting firms with the scale, reach, capabilities and commitment to meet clients’ enterprise digital transformation needs. Our 77,500 consultants and professionals work side-by-side with clients in 10 industries across more than 400 location ...