SRE

Technopride Ltd

Job Location:

Brighton and Hove - UK

Monthly Salary: Not Disclosed

Experience Required: 5 years

Posted on: 30+ days ago

Vacancies: 1

Job Summary

Role: SRE

Location: Hove, UK

Permanent / Contract: Open to both

Onsite/Remote/Hybrid: Hybrid, 2 days per week in the office

No. of Positions: 1

We are seeking an experienced Site Reliability Engineer (SRE) to drive the modernization of IT operations by implementing observability practices, automation, and reliability engineering principles. The role requires a strategic thinker with strong hands-on expertise who can enhance system reliability, scalability, and operational efficiency while reducing manual operational tasks.

The successful candidate will work closely with engineering, architecture, and product teams to implement modern reliability practices, automate operational workflows, and establish robust monitoring and incident management frameworks.

Key Responsibilities

Collaborate with engineering teams to modernize IT operations by improving observability, automation, and operational efficiency.

Design and implement observability platforms to effectively monitor system health, performance, and reliability.

Develop strategies for AI-driven alerting and proactive anomaly detection to reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR).

Establish and enforce SRE best practices, including Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets.

Define and implement an AIOps roadmap to enhance operational intelligence and automation.

Automate repetitive operational tasks (toil reduction) using scripting, orchestration tools, and automation frameworks.

Implement self-healing systems and automated incident response mechanisms to support autonomous operations.

Collaborate with cross-functional teams to ensure systems are scalable, resilient, and maintainable.

Lead incident management, root cause analysis, and post-incident improvement initiatives.

Promote shift-left reliability practices across engineering and product teams.

Mentor team members and advocate for a culture of reliability, automation, and continuous improvement.

Required Skills & Experience

Strong expertise in Site Reliability Engineering (SRE) principles and practices.

Hands-on experience implementing observability solutions, particularly with Dynatrace and Datadog.

Strong scripting and automation experience using Python and Ansible.

Experience working with cloud platforms such as AWS and Azure.

Solid understanding of containerization and orchestration technologies, including Docker and Kubernetes.

Experience working with cloud-native distributed systems and microservices architectures.

Company Industry

IT Services and IT Consulting
