Senior Site Reliability Engineer

Mars Capital


Job Location:

Dublin - Ireland

Monthly Salary: Not Disclosed
Posted on: 5 hours ago
Vacancies: 1

Job Summary

 

Role Overview

We are seeking a Senior Site Reliability Engineer (SRE) with strong expertise in AWS cloud infrastructure, containerised platforms and Azure DevOps CI/CD pipelines. The successful candidate will focus on improving system reliability, availability, performance and scalability while enabling engineering teams to deliver high-quality services efficiently.

This role combines engineering and operational excellence with a focus on automation, observability, scalability and resilience across cloud-native environments. As a senior engineer, you will drive engineering-led solutions to reduce operational toil, enhance system reliability and promote DevOps and SRE best practices.

Note: This is a reliability-focused engineering role with on-call responsibilities and involvement in platform modernisation initiatives.

Key Responsibilities

  • Design, implement and manage highly available and scalable infrastructure on AWS.
  • Build, maintain and optimise DevOps pipelines (CI/CD) for automated build, test and deployment processes.
  • Implement end-to-end CI/CD workflows, including multi-stage pipelines, approvals and release strategies.
  • Manage and support Windows and Linux-based production systems.
  • Deploy, manage and optimise containerised applications using Docker and Kubernetes (EKS/AKS).
  • Implement Infrastructure as Code (IaC) using Terraform, CloudFormation or ARM templates.
  • Develop and maintain automation scripts using PowerShell, Bash or Python.
  • Define and monitor SLIs, SLOs and SLAs to ensure system reliability.
  • Implement robust monitoring, logging and alerting solutions (CloudWatch, Prometheus, Grafana, Azure Monitor).
  • Lead incident management, troubleshooting and root cause analysis (RCA) for production issues.
  • Drive performance tuning and capacity planning for applications and infrastructure.
  • Collaborate with development teams to improve deployment strategies (blue-green, canary releases).
  • Ensure security compliance and best practices across CI/CD pipelines and infrastructure.

Qualifications:

 

Required Skills & Experience

  • 8 years of experience in Site Reliability Engineering / DevOps / Infrastructure Engineering
  • Strong hands-on experience with AWS services (EC2, S3, RDS, VPC, IAM, ELB, Auto Scaling, CloudWatch)
  • Deep expertise in Azure DevOps Pipelines (CI/CD), including YAML pipelines and release automation
  • Experience designing multi-stage pipelines and deployment strategies
  • Expertise in Windows Server administration, including IIS application support
  • Strong experience with Linux system administration
  • Hands-on experience with Docker and Kubernetes (EKS/AKS)
  • Experience with Infrastructure as Code (Terraform, CloudFormation or ARM templates)
  • Strong scripting skills in PowerShell (mandatory) and Bash/Python
  • Experience with monitoring and logging tools (Prometheus, Grafana, ELK, CloudWatch)
  • Solid understanding of networking, security and cloud architecture principles

Preferred Qualifications

  • Experience with hybrid cloud or multi-cloud environments
  • Knowledge of Active Directory, Group Policy and enterprise Windows environments
  • Familiarity with Helm, GitOps practices or service mesh technologies
  • Experience with performance testing and tuning
  • Relevant certifications (AWS, Kubernetes, Azure DevOps)

Key Competencies / Characteristics

  • Reliability-driven: Focused on uptime, performance and system resilience
  • Automation-first mindset: Continuously reduces manual effort and operational toil
  • Ownership mentality: Takes end-to-end responsibility from design through production
  • Strong communicator: Clearly articulates incidents, RCA outcomes and technical concepts
  • Collaborative: Works effectively with platform, security and application teams
  • Mentorship mindset: Actively supports and develops junior team members
  • Continuous learner: Keeps up with evolving SRE practices and cloud-native technologies

Additional Information:

D&I statement


Remote Work:

No


Employment Type:

Full-time


About Company

Due to continued growth of our servicing platform, we are looking for a Team Leader to support the business as it goes through this current period of growth. The successful candidate will act as team leader for a team of Customer Service Executives and Asset Managers working within th ...
