Cloud Operations Lead, SRE / DevOps / Platform Engineering

NewVison


Job Location:

Pune - India

Monthly Salary: Not Disclosed
Posted on: 3 hours ago
Vacancies: 1 Vacancy

Job Summary

Cloud Operations Lead, SRE / DevOps / Platform Engineering

Experience

9-12 Years

Shift

Overlap with US & EU Business Hours

Role Summary

We are seeking an experienced Cloud Operations Lead with a strong background in Site Reliability Engineering (SRE), DevOps, and Platform Engineering. The ideal candidate will be responsible for ensuring the reliability, security, and operational excellence of cloud-based platforms and services while leading a small team of engineers.

This is a hands-on role with approximately 80% focus on Cloud Operations, Production Support, Reliability, and Platform Ownership, combined with leadership responsibilities.

Key Responsibilities

  • Lead cloud operations and production support activities across AWS-based platforms.
  • Manage and troubleshoot Linux systems, cloud infrastructure, networking, and Kubernetes environments.
  • Drive operational excellence through monitoring, observability, automation, and incident management.
  • Build and maintain Infrastructure as Code (IaC) using Terraform, Ansible, and Helm.
  • Support and optimize CI/CD pipelines using GitHub Actions, Jenkins, and deployment automation tools.
  • Design and implement monitoring, alerting, dashboards, runbooks, and operational standards.
  • Lead vulnerability remediation, secrets management, access governance, and platform hardening initiatives.
  • Automate infrastructure provisioning, OS/AMI upgrades, and day-2 operational activities.
  • Support production deployments, release management, and change control processes.
  • Collaborate with engineering teams on onboarding, platform readiness, access management, and operational best practices.
  • Mentor and guide junior engineers while driving continuous service improvement.

Required Skills (Non-Negotiable)

  • Strong Linux Administration and Troubleshooting
  • AWS Cloud Operations (IAM, EC2, Networking, EKS)
  • Kubernetes Administration and Production Support
  • Terraform and Infrastructure as Code
  • CI/CD Tools (GitHub Actions, Jenkins)
  • Monitoring & Observability (Datadog, Prometheus, Grafana, SignalFx, Nagios, or similar)
  • Incident Management, Root Cause Analysis, and Production Support
  • Security Operations, including vulnerability remediation, access management, and secrets rotation
  • Experience working in enterprise environments with formal change management processes

Preferred Skills

  • DNS, Proxy, Edge Services, and Networking Platforms
  • Teleport, Bastion Hosts, Service Accounts, and Access Management Solutions
  • Container Security and Supply Chain Security
  • AMI/Image Lifecycle Management
  • AI-enabled Operations, Custom Agentic AI, or Hyperscaler AI Services

Leadership Expectations

  • Lead a team of cloud/platform engineers.
  • Drive operational governance, service reliability, and process standardization.
  • Promote automation-first and reliability-first engineering practices.
  • Partner with stakeholders across Cloud, Infrastructure, Security, and Application teams.

Nice to Have

  • Experience in SRE, Platform Engineering, or Managed Services environments.
  • Exposure to AI-powered operations, observability, or automation solutions.
  • Experience supporting large-scale distributed systems and cloud-native applications.

Required Experience:

IC
