Lead Site Reliability Engineer ServiceNow Platform

VREZOLV PARTNERS PRIVATE LIMITED

Posted on : 07-08-2025

Employer Active

1 Vacancy

Job Location

Hyderabad - India

Monthly Salary

Not Disclosed

Job Description

Lead Site Reliability Engineer (ServiceNow Platform)

What you get to do in this role:

As the Lead Site Reliability Engineer (SRE), you will spearhead the design and implementation of observability and reliability strategies across our ServiceNow platform and integrated third-party systems. You'll lead the charge in establishing and maturing telemetry frameworks, ensuring visibility of the golden signals (latency, traffic, errors, and saturation) to drive proactive performance and availability management.

This role is both strategic and hands-on. You will mentor other engineers, collaborate with cross-functional teams, and influence platform-wide improvements. Your work will directly enhance system resilience, user experience, and operational excellence.

Key Responsibilities:

Architect and implement telemetry and observability frameworks across ServiceNow and its ecosystem.
Define and monitor golden signals to drive proactive SRE practices.
Lead incident and problem management reviews, ensuring data-driven root cause analysis and continuous improvement.
Collaborate with development, support, and infrastructure teams to implement self-healing, auto-remediation, and resiliency patterns.
Develop and mature dashboards and real-time alerts using tools such as the ServiceNow platform along with Datadog, Splunk, or Grafana.
Drive automation for reliability checks, capacity planning, and environment health.
Establish and promote SRE best practices, playbooks, and operational readiness standards across product teams.
Represent SRE in architectural reviews and platform governance meetings.
Mentor junior engineers, foster a learning culture, and ensure adoption of reliability-first principles.

Qualifications:

Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
10 years of IT experience, with 5 years in SRE or production engineering and 2 years in a lead or principal role.
Proven experience in managing observability, telemetry, and incident response frameworks at scale.
Deep understanding of ITIL-aligned processes (Incident, Problem, Change).
Strong leadership and collaboration skills, with the ability to influence across engineering and business teams.
Excellent verbal and written communication, especially in articulating technical decisions to business stakeholders.

Technical Requirements:

Strong experience with monitoring tools such as Datadog, Splunk, Prometheus, Grafana, or equivalents.
Proficient in ServiceNow platform administration, performance tuning, and API integrations.
Solid command of Unix/Linux internals, system performance tuning, and network troubleshooting.
Proficient in one or more scripting languages: Python, Shell, JavaScript.
Hands-on experience with Kubernetes, containers, and CI/CD pipelines.
Deep understanding of HTTP/S, DNS, SSL/TLS, and other web protocols.
Familiarity with cloud platforms (AWS, Azure, or GCP); certifications preferred.

Preferred (Nice to Have):

Experience with ServiceNow ITOM modules such as Event Management, AIOps, and Discovery.
Knowledge of AI/ML-based anomaly detection and alerting strategies.
Experience with infrastructure-as-code using tools such as Ansible or Terraform.
Familiarity with performance profiling and diagnostics of complex applications.
Previous success in establishing SRE teams or practices from the ground up.

Datadog, Splunk, ServiceNow Platform Administration, Kubernetes

Employment Type

Full Time

About Company

VREZOLV PARTNERS PRIVATE LIMITED
