Technical Support Manager Cloud SRE

Datavail Infotech

Job Location:

Mumbai - India

Monthly Salary: Not Disclosed
Posted on: 2 days ago
Vacancies: 1 Vacancy

Job Summary

Description

Job Title: Technical Support Manager SRE (Cloud Managed Services)

Education: Any Graduate

Experience: 12 years

Location: Mumbai

Job Description:

Role Overview:

We are seeking an experienced SRE Support Manager to lead multi-cloud managed services support operations across Amazon Web Services, Microsoft Azure, and Google Cloud environments. This role is responsible for ensuring platform reliability, operational excellence, SLA governance, and customer satisfaction while managing Level 1 and Level 2 SRE engineers and collaborating with Level 3 engineering teams.

The ideal candidate combines strong people leadership, customer management, and cloud operations expertise with a deep understanding of Site Reliability Engineering practices, including SLIs, SLOs, SLAs, error budgets, observability, automation, and incident management.

Experience Required:

12 years of overall experience, including 3 years in a team leadership / support management / SRE management role.

Key Responsibilities:

Team Leadership & Support Operations:

  • Lead, mentor, and develop Level 1 and Level 2 SRE Support Engineers.

  • Manage 24x7 support coverage, shift planning, workforce utilization, and operational readiness.

  • Establish clear escalation matrices and support ownership models.

  • Drive upskilling across cloud technologies, troubleshooting, and SRE practices.

Customer & Service Delivery Management:

  • Manage support delivery for multiple enterprise managed services customers.

  • Understand customer expectations, business priorities, and critical workloads.

  • Act as the senior escalation point for high-priority incidents and service concerns.

  • Ensure proactive communication during outages, incidents, and service requests.

Reliability Engineering & SRE Governance:

  • Define and monitor Service Level Indicators (SLIs) for availability, latency, error rates, throughput, and ticket responsiveness.

  • Establish and govern Service Level Objectives (SLOs) aligned to customer needs.

  • Manage error budgets and balance reliability with speed of change.

  • Improve operational reliability through automation, standardization, and continuous improvement.

  • Reduce toil and repetitive manual support tasks.
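As an illustration of the SLI, SLO, and error budget relationship described above, here is a minimal Python sketch. It assumes a request-based availability SLI; the function name `error_budget_report`, the 99.9% target, and the sample counts are hypothetical and not taken from the role description.

```python
def error_budget_report(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Summarize error budget usage for one SLO window.

    slo_target   -- target fraction of good events, e.g. 0.999 (hypothetical value)
    total_events -- all requests (or checks) observed in the window
    bad_events   -- events that violated the SLI (errors, too-slow responses)
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")

    # SLI: the measured fraction of good events in the window.
    sli = (total_events - bad_events) / total_events
    # Error budget: the number of bad events the SLO tolerates in the window.
    allowed_bad = (1.0 - slo_target) * total_events
    # Fraction of the budget already spent; above 1.0 means the budget is exhausted.
    consumed = bad_events / allowed_bad
    return {
        "sli": sli,
        "allowed_bad_events": allowed_bad,
        "budget_consumed": consumed,
        "slo_met": sli >= slo_target,
    }


# Hypothetical window: 1,000,000 requests, 250 of them failed, 99.9% target.
report = error_budget_report(slo_target=0.999, total_events=1_000_000, bad_events=250)
print(report)
```

In this sketch, a team might slow down releases as `budget_consumed` approaches 1.0 and resume the normal pace of change once the window resets; the actual policy thresholds would be agreed with each customer.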

Incident / Problem / Change Management:

  • Lead major incident management bridges and restoration activities.

  • Coordinate with Level 3 teams, cloud vendors, and customer stakeholders.

  • Drive Root Cause Analysis (RCA) and preventive and corrective actions.

  • Ensure controlled execution of change management, patching, releases, and maintenance.

SLA / KPI / Reporting:

  • Track contractual SLAs, operational KPIs, MTTR, MTTD, ticket aging, and backlog health.

  • Publish weekly/monthly service review dashboards.

  • Highlight risks, recurring issues, and improvement opportunities.

  • Ensure audit readiness and governance compliance.
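The MTTR and MTTD figures tracked above can be derived from incident timestamps. The following Python sketch is one possible way to compute them, assuming each incident records when the fault started, when it was detected, and when service was restored. The `Incident` class and the sample timestamps are hypothetical. MTTR is measured here from fault start to restoration; some teams measure it from detection instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    started: datetime   # when the fault began
    detected: datetime  # when monitoring or a customer flagged it
    resolved: datetime  # when service was restored


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time to detect: average of (detected - started)."""
    return timedelta(seconds=mean(
        (i.detected - i.started).total_seconds() for i in incidents
    ))


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time to restore: average of (resolved - started)."""
    return timedelta(seconds=mean(
        (i.resolved - i.started).total_seconds() for i in incidents
    ))


# Hypothetical sample data for one reporting period.
incidents = [
    Incident(datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 5), datetime(2024, 1, 5, 10, 45)),
    Incident(datetime(2024, 1, 9, 12, 0), datetime(2024, 1, 9, 12, 15), datetime(2024, 1, 9, 13, 15)),
]
print("MTTD:", mttd(incidents))
print("MTTR:", mttr(incidents))
```

Numbers like these would typically feed the weekly or monthly service review dashboards, alongside ticket aging and backlog counts.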

Multi-Cloud Platform Management:

  • Oversee customer workloads on:

      • Amazon Web Services - EC2, RDS, EKS, Lambda, IAM, VPC, CloudWatch

      • Microsoft Azure - Azure VM, AKS, Azure SQL, VNets, Monitor, Defender

      • Google Cloud - Compute Engine, GKE, Cloud SQL, IAM, Operations Suite

Required Technical Skills:

Cloud & Infrastructure

  • Strong hands-on experience with one or more cloud platforms: Amazon Web Services / Microsoft Azure / Google Cloud.

  • Good understanding of compute, storage, networking, IAM, backup, DR, and security controls.

  • Experience with Linux and/or Windows server administration.

  • Knowledge of containers and orchestration platforms such as Kubernetes / Docker.

SRE & Reliability Engineering

  • Strong knowledge of SRE principles and best practices.

  • Experience designing and tracking SLI, SLO, and SLA frameworks.

  • Practical understanding of error budget policy management.

  • Expertise in incident response, on-call operations, postmortems, and resilience engineering.

  • Familiarity with capacity planning, availability engineering, and performance optimization.

Monitoring / Observability

  • Hands-on experience with:

      • Amazon CloudWatch

      • Azure Monitor

      • Google Cloud Operations Suite

      • Datadog

      • Grafana

      • Prometheus

Automation / DevOps

  • Experience with scripting: Python / Bash / PowerShell.

  • Infrastructure as Code using Terraform or similar.

  • CI/CD exposure using GitHub Actions, Jenkins, or similar tools.

Leadership Skills

  • Proven experience managing technical support or SRE operations teams.

  • Strong customer-facing communication skills.

  • Ability to manage escalations under pressure.

  • Strong decision-making and stakeholder management skills.

Preferred Qualifications

  • ITIL Foundation / ITSM knowledge.

  • AWS / Azure / GCP certifications.

  • Experience in a Managed Services / MSP environment.

  • Experience leading 24x7 global support teams.

Success Metrics

  • SLA / SLO attainment

  • Error budget compliance

  • MTTR reduction

  • Service availability improvement

  • Customer satisfaction (CSAT)

  • Ticket backlog health

  • Automation delivered

  • Team productivity and retention




Required Experience:

Manager

About Company

Datavail is a leading provider of data management, application development, analytics, and cloud services, with more than 1,000 professionals helping clients build and manage applications and data via a world-class tech-enabled delivery platform and software solutions across all leadi ...
