Site Reliability Engineer (SRE) – Azure

Job Location: Atlanta, GA - USA

Monthly Salary: Not Disclosed
Experience Required: 5 years
Posted on: 5 hours ago
Vacancies: 1

Job Summary

Role Overview
We are seeking an experienced Site Reliability Engineer (SRE) with strong expertise in Microsoft Azure and proven experience supporting environments within the Banking or Financial Services industry. This role is responsible for designing and maintaining reliable, scalable, and secure cloud infrastructure while ensuring high availability and optimal performance of mission-critical applications in a regulated environment.

The ideal candidate brings a strong production support mindset and can effectively balance system reliability, automation, and delivery speed.


Key Responsibilities

  • Design, deploy, and manage highly available, scalable cloud infrastructure on Microsoft Azure.

  • Enhance system reliability, performance, and uptime through automation and proactive monitoring.

  • Build, maintain, and optimize CI/CD pipelines for enterprise and cloud-native applications.

  • Define, track, and improve SLIs, SLOs, and SLAs.

  • Implement and manage observability solutions, including logging, monitoring, and alerting.

  • Support incident response, perform root cause analysis, and drive post-incident improvements.

  • Automate infrastructure provisioning using Infrastructure as Code (IaC) practices.

  • Ensure infrastructure and applications comply with banking security and regulatory standards.

  • Collaborate closely with DevOps, development, security, and operations teams.



Requirements

Required Skills & Experience

  • 7-12 years of experience in Site Reliability Engineering, DevOps, or Production Engineering roles.

  • Strong hands-on experience with Microsoft Azure services, including VMs, AKS, App Services, Networking, Storage, and Azure AD.

  • Experience with Infrastructure as Code tools such as Terraform, ARM templates, or Bicep.

  • Expertise in CI/CD tools such as Azure DevOps, Jenkins, or GitHub Actions.

  • Strong scripting skills using PowerShell, Python, or Bash.

  • Experience with Docker and Kubernetes container orchestration.

  • Prior experience working within Banking or Financial Services environments.

  • Solid understanding of security, compliance, and risk management in regulated industries.


Preferred Qualifications

  • Experience with monitoring and observability tools such as Azure Monitor, Prometheus, Grafana, or Splunk.

  • Knowledge of high availability and disaster recovery architectures.

  • Familiarity with ITIL processes and incident management frameworks.

  • Microsoft Azure certifications such as AZ-104, AZ-400, or equivalent are a plus.




Company Industry

IT Services and IT Consulting

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting