Site Reliability Engineer (SRE)

Pune - India

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

The job posting is outdated and position may be filled

Job Summary

About Parkar:

At Parkar, we are redefining cloud transformation with the power of Microsoft Azure and AWS platforms. Our commitment to aligning with industry-leading cloud technologies enables us to deliver tailored, best-in-class solutions that drive tangible results for our clients.

With over a decade of experience and more than 100 enterprise engagements across diverse industries, we specialize in empowering businesses to innovate and scale. From AIOps to Generative AI and advanced Machine Learning, our expertise spans the technologies that shape tomorrow's enterprises.

As a Microsoft Gold Partner, we boast over 125 Azure certifications and a track record of 400 successful projects, powered by our team of 200 skilled engineers. Whether it's data analytics, DevOps automation, or application development, Parkar is dedicated to achieving each client's unique transformation goals with agility, precision, and a relentless pursuit of excellence.

Vector is our flagship platform, built on the latest in IT and AI technology to empower organizations to work smarter and stay ahead in a fast-changing digital world. It streamlines digital operations with intelligent automation and seamless cloud integration. With features like real-time monitoring, early problem detection, and scalable flexibility, Vector helps reduce costs, boost reliability, and drive continuous innovation.

For more information, visit our website:

About Role:

We are looking for a highly skilled Site Reliability Engineer (SRE) to join our team. The ideal candidate will focus on enhancing system reliability, automation, and performance, ensuring high availability and scalability of our applications. You will work closely with development and operations teams to improve deployment pipelines, monitoring, and incident response.

Your Role at a Glance:

Design, develop, and maintain scalable, reliable, and secure infrastructure.
Implement monitoring, logging, and alerting solutions using tools like Datadog (required); experience with SolarWinds, Prometheus, Grafana, ELK Stack, or Splunk is an advantage.
Create automation scripts and playbooks for repetitive tasks.
Assist with the deployment and configuration of the observability stack.
Create dashboards for availability, latency, and SLO/SLA tracking in Datadog (we are migrating from SolarWinds to Datadog).
Improve system observability and enhance incident response through automation and root cause analysis.
Optimize CI/CD pipelines to ensure smooth deployments and minimal downtime.
Automate infrastructure provisioning and management using Terraform, Ansible, or Kubernetes (good to have).
Ensure high availability and disaster recovery through load balancing, failover mechanisms, and backups.
Collaborate with development teams to enhance application performance, reliability, and scalability.
Manage cloud-based environments (AWS, Azure, or GCP) for efficient resource utilization.
Enhance security best practices, including vulnerability assessments and patch management.
Participate in on-call rotations to troubleshoot and resolve critical system issues.

The Expertise You'll Bring:

5 years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles.
Strong knowledge of Windows and Linux/Unix systems and shell scripting.
Hands-on experience with cloud platforms (AWS, Azure, or GCP) and Jira dashboards.
Expertise in Kubernetes, Docker, and container orchestration.
Experience with CI/CD tools like Copado (required), Jenkins, GitHub Actions, or GitLab CI.
Proficiency in Infrastructure as Code (IaC) tools like Terraform, Ansible, or CloudFormation.
Solid experience with monitoring and observability tools (Datadog required; Prometheus, Grafana, ELK, Splunk, or New Relic).
Strong knowledge of networking, security, and system architecture.
Experience with scripting languages like Python, Bash, or Go.
Familiarity with database performance tuning and optimization.
Strong problem-solving skills and the ability to work in a fast-paced Agile environment.

Education

Bachelor's degree in computer science, engineering, or a similar domain.

Key Skills

Kubernetes
FMEA
Continuous Improvement
Elasticsearch
Go
Root Cause Analysis
Maximo
CMMS
Maintenance
Mechanical Engineering
Manufacturing
Troubleshooting

About Company

Parkar Digital
