Senior - Site Reliability Engineer

Job Location

Hyderabad - India

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

  • Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, and Dynatrace to automate deployment and management of infrastructure.

  • Build and manage CI/CD pipelines to ensure efficient and reliable application deployments.

  • Improve infrastructure provisioning and configuration through automation, minimizing manual intervention and reducing human error.

  • Monitor the health, performance, and reliability of production systems and applications.

  • Design, implement, and maintain automated monitoring solutions using tools such as Datadog.

  • Define and monitor service level objectives (SLOs), service level indicators (SLIs), and error budgets to ensure system reliability and availability meet customer expectations.

  • Implement effective alerting systems to identify and address potential issues before they impact users.

  • Lead root cause analysis (RCA) and post-mortem investigations after incidents to identify improvements and avoid recurrence.

  • Respond to production incidents, diagnose root causes, and implement corrective actions.

  • Create and maintain playbooks and documentation for incident response, troubleshooting, and recovery processes.

  • Collaborate closely with development teams during the post-deployment phase to ensure smooth rollouts and address any production issues.

  • Work alongside software engineers to design, deploy, and scale applications that are highly available, resilient, and fault tolerant.

  • Provide guidance and support to ensure that code is written with an operational mindset, enabling easy deployment, monitoring, and debugging.

  • Act as a bridge between development, operations, and business teams, ensuring that infrastructure and software align with business goals.

  • Experience working with cloud platforms such as AWS, Microsoft Azure, and/or GCP.

  • Expertise with Git, Jenkins, CircleCI, GitLab CI, or similar CI/CD platforms.

  • Stay current with emerging technologies, tools, and trends in site reliability engineering, DevOps, and cloud computing.

  • Lead or contribute to internal initiatives aimed at improving system performance, reliability, and operational efficiency.

  • Propose and lead process improvements, optimizations, and innovations in automation and system design.

  • Strong written and verbal communication skills, with the ability to collaborate with cross-functional teams, write documentation, and explain technical concepts to non-technical stakeholders.

  • Ability to work effectively in a fast-paced environment, collaborating with software developers, other SREs, operations teams, and business stakeholders.

Employment Type

Full-time
