Site Reliability Engineer

Right Advisors


Job Location:

Gurgaon - India

Monthly Salary: L 5 - 10
Experience Required: 3-7 years
Posted on: 12 hours ago
Vacancies: 1 Vacancy

Job Summary

About the Role

We are seeking a proactive and detail-oriented Site Reliability Engineer (SRE) with 3 years of experience to ensure high availability, reliability, and performance of production systems. This role focuses on automation, incident management, and cross-team coordination to drive operational excellence.

Key Responsibilities

Maintain reliable, scalable, and secure production environments.
Implement and manage monitoring, alerting, and logging solutions.
Contribute to defining and tracking SLIs/SLOs and support error budget practices.
Automate operational tasks to improve efficiency and reduce manual effort.
Perform troubleshooting and Root Cause Analysis (RCA) for production incidents.
Optimize system performance, availability, and capacity.
Maintain SOPs and incident documentation in Confluence.
Adhere to change management, deployment governance, and disaster recovery standards.
Support incident response for critical production services.

Collaboration & Tools
Coordinate with external vendors and internal cross-functional teams.
Work closely with Engineering, Product Owners, and Operations teams.
Manage incidents and changes using ServiceNow & JIRA.
Collaborate through Slack and structured communication channels.

Technical Skills

Systems & Cloud
Strong knowledge of Windows and Linux/Unix systems.
Solid understanding of networking fundamentals (DNS, TCP/IP, load balancing, firewalls).
Experience with at least one cloud platform (AWS, Azure, or GCP).

Automation & CI/CD
Proficiency in one scripting/programming language (Python, Go, Bash, PowerShell, or Java).
Understanding of CI/CD pipelines and automation practices.

Containers
Hands-on experience with Docker and Kubernetes.
Experience with monitoring tools such as Power BI.
Ability to analyze logs, metrics, and traces for troubleshooting.

ITSM & Documentation
Experience with ServiceNow & JIRA (incident/change/problem workflows).
Working knowledge of Confluence for technical documentation and knowledge management.

Additional Experience (Preferred)
Background in DevOps, Cloud Engineering, or Platform Engineering.
Understanding of security best practices and compliance standards.
Familiarity with AI-assisted engineering tools (Claude Code, Jellyfish, GitHub Copilot).
Exposure to large-scale or production-grade systems.

Soft Skills
Strong analytical and troubleshooting mindset
Excellent written and verbal communication skills
Ownership-driven and composed during high-severity incidents

Accessibility & Inclusion Statement
We are committed to creating an inclusive environment for all employees, including persons with disabilities. Reasonable accommodations will be provided upon request.



Required Skills:

SRE, DevOps, TCP/IP, DNS, Linux, AWS, Azure


About Company


We are one of the fastest-growing HR services organizations. We create long-term, sustainable partnerships with our clients by providing resource solutions to meet their business needs. Expertise and leadership propelled with a successful service model, which is intrinsic to our clients ...
