Job Description:
We are seeking a Senior Site Reliability Engineer (SRE) with strong expertise in Azure Cloud Architecture to design, implement, and optimize scalable, secure, and cost-effective cloud and reliability solutions. The ideal candidate will combine architectural skills with hands-on operational excellence in monitoring, automation, and performance optimization across Azure environments.
Key Responsibilities:
- Define and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs) to meet reliability and availability targets.
- Evaluate existing systems and implement mechanisms for proactive monitoring, alerting, and incident response.
- Review and enhance Non-Functional Requirements (NFR) processes, ensuring robust coverage of parameters such as performance, scalability, and reliability.
- Design and develop reliable and automated Azure cloud architectures aligned with business and operational goals.
- Implement CloudOps and SRE best practices, including process standardization, automation, and toil reduction.
- Collaborate with cross-functional teams to integrate Azure services effectively while improving observability and system performance.
- Support FinOps initiatives to ensure cost optimization and efficient resource utilization.
- Drive initiatives for self-service enablement, noise reduction, and operational resilience.
- Mentor junior engineers and promote a culture of reliability, ownership, and continuous improvement.
Technical Skills (Mandatory):
- Azure Cloud Architecture, Azure App Services, Azure Functions, Azure Monitor, Azure Logic Apps, Azure DevOps, Azure SQL, Azure Front Door, Azure Service Bus.
- Infrastructure as Code (IaC): Terraform, Ansible.
- Automation & CI/CD: Jenkins, GitLab, PowerShell, Shell scripting.
- Application Stack: .NET Framework, C#, Java, Spring Boot, Angular, JavaScript, Entity Framework (EF/EF Core).
- Containers & Orchestration: Docker, Kubernetes.
- Database: PostgreSQL.
- Architecture Patterns: Microservices, Application Architecture, Application Re-architecting, Architectural Diagrams & Documentation.
Preferred Qualifications:
- Experience implementing SRE principles such as error budgets, incident management, postmortems, and observability practices.
- Strong understanding of cloud security, performance tuning, and disaster recovery strategies.
- Familiarity with monitoring tools (e.g., Azure Monitor, Application Insights, Prometheus, Grafana).
- Excellent problem-solving and cross-functional collaboration skills.