SRE

Bangalore - India

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1

Job Summary

SRE with AWS, Elasticsearch, Kubernetes, and Grafana expertise and 10 years of experience, for Bangalore.

Work from office all 5 days.

Location: Koramangala, Bangalore

25 LPA

Job Description

We are seeking a highly experienced Site Reliability Engineer (SRE) with 10 years of experience in designing, implementing, and maintaining highly available, scalable, and resilient systems. The ideal candidate will have deep expertise in AWS, Kubernetes, Elasticsearch, Grafana, and modern SRE practices, with a strong focus on automation, observability, and operational excellence.

Key Responsibilities

Design, build, and operate highly reliable, scalable, and fault-tolerant systems in AWS cloud environments.
Implement and manage Kubernetes (EKS) clusters, including deployment strategies, scaling, upgrades, and security hardening.
Own and improve SLIs, SLOs, and SLAs, driving reliability through data-driven decisions.
Architect and maintain observability platforms using Grafana, Prometheus, and Elasticsearch.
Manage and optimize Elasticsearch clusters, including indexing strategies, performance tuning, scaling, and backup/restore.
Develop and maintain monitoring, alerting, and logging solutions to ensure proactive incident detection and response.
Lead incident management, root cause analysis (RCA), postmortems, and continuous improvement initiatives.
Automate infrastructure and operations using Infrastructure as Code (IaC) and scripting.
Collaborate with development teams to improve system reliability, deployment pipelines, and release processes.
Implement CI/CD best practices and reduce deployment risk through canary, blue-green, and rolling deployments.
Ensure security, compliance, and cost optimization across cloud infrastructure.
Mentor junior SREs and drive adoption of SRE best practices across teams.

Required Skills & Qualifications

Core Technical Skills

10 years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
Strong hands-on experience with AWS services (EC2, EKS, S3, RDS, IAM, VPC, CloudWatch, Auto Scaling).
Advanced expertise in Kubernetes (EKS preferred), Helm, and container orchestration.
Deep knowledge of Elasticsearch (cluster management, indexing, search optimization, performance tuning).
Strong experience with Grafana and observability stacks (Prometheus, Loki, ELK).
Proficiency in Linux system administration and networking fundamentals.
Experience with Infrastructure as Code tools (Terraform, CloudFormation).
Strong scripting skills in Python, Bash, or Go.
