Senior Site Reliability Engineer – Linux

Hyderabad - India

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Build and maintain platform automation for provisioning, deployment, patching, and remediation tasks.
Enhance observability frameworks, implementing monitoring, logging, and alerting for Linux workloads.
Design and implement health checks and SLIs/SLOs for availability and reliability tracking.
Collaborate with application and DevOps teams to ensure services follow reliability best practices.
Develop and maintain CI/CD pipelines using Jenkins, GitLab CI, or equivalent tools.
Implement Infrastructure-as-Code (IaC) solutions using Terraform, Ansible, or CloudFormation.
Participate in readiness reviews for new service releases.
Conduct root cause analysis and lead post-incident reviews to drive reliability improvements.
Partner with Information Security teams to enforce compliance, patching, and hardening standards.
Optimize system performance and capacity across compute, storage, and container environments.
Automate recurring operational tasks to enhance efficiency and reduce manual intervention.

Qualifications:

Bachelor's degree in Computer Science, Engineering, or a related discipline (or equivalent experience).
7 years of hands-on experience in large-scale Linux system engineering, reliability, or operations.
3 years designing, implementing, and maintaining enterprise distributed systems.
Expertise in Linux distributions and associated system services.
Strong knowledge of cloud environments and hybrid infrastructure models.
Proficiency in Bash, Python, and infrastructure automation.
Hands-on experience with CI/CD, configuration management, and version control systems.
Solid understanding of containerization and orchestration.
Proven troubleshooting and performance tuning skills for distributed and containerized systems.
Familiarity with observability tools (Prometheus, Grafana, ELK, Datadog).
Strong grasp of networking, DNS, TLS, and load balancing concepts.
Excellent communication and collaboration skills with cross-functional teams.

Additional Information:

Cloud-native Linux deployments on AWS, Azure, or GCP.
Experience with service mesh (Istio, Linkerd) and API gateways.
Exposure to automation frameworks for security hardening (OpenSCAP, CIS benchmarks).
Experience with log analytics and distributed tracing (Jaeger, OpenTelemetry).
Familiarity with database performance tuning (MySQL, PostgreSQL, or MongoDB).
Scripting for continuous compliance and infrastructure drift management.
Experience in managing Linux-based container platforms and Kubernetes clusters at scale.

** The candidate should be willing to work in core US shift schedules. **

Remote Work :

Employment Type :

Full-time

Key Skills

Kubernetes
FMEA
Continuous Improvement
Elasticsearch
Go
Root cause Analysis
Maximo
CMMS
Maintenance
Mechanical Engineering
Manufacturing
Troubleshooting

About Company

Sutherland

Sutherland is seeking an organized and reliable person to join us as Admin Specialist. We are a group of driven and supportive individuals. If you are looking to build a fulfilling career and are confident you have the skills and experience to help us succeed, we want to work with you ...