Associate Staff Engineer, DevOps

Nagarro

Job Location:

Mumbai - India

Monthly Salary: Not Disclosed
Posted on: 17 hours ago
Vacancies: 1

Department:

Engineering

Job Summary

Requirement:

  • Experience: 5 years
  • Strong experience in DevOps or Site Reliability Engineering (SRE) roles.
  • Strong knowledge of Docker, Kubernetes, Terraform, and CI/CD pipelines.
  • Hands-on experience with AWS, Azure, or other cloud platforms.
  • Familiarity with GPU infrastructure and ML workloads is a plus.
  • Good understanding of monitoring and logging systems (Prometheus, Grafana).
  • Ability to collaborate with ML teams for optimized inference and deployment.
  • Strong troubleshooting and problem-solving skills in high-scale environments.
  • Knowledge of infrastructure security best practices, cost optimization, and performance tuning.
  • Exposure to vector databases and AI/ML deployment pipelines is highly desirable.

Responsibilities:

  • Maintain and manage Kubernetes clusters, AWS/Azure environments, and GPU infrastructure for high-performance workloads.
  • Design and implement CI/CD pipelines for seamless deployments and faster release cycles.
  • Set up and maintain monitoring and logging systems using Prometheus and Grafana to ensure system health and reliability.
  • Support vector database scaling and model deployment for AI/ML workloads.
  • Collaborate with ML engineering teams to optimize inference performance and resource utilization.
  • Ensure high availability, security, and scalability of infrastructure across multiple environments.
  • Automate infrastructure provisioning and configuration using Terraform and other IaC tools.
  • Troubleshoot production issues and implement proactive measures to prevent downtime.
  • Continuously improve deployment processes and infrastructure reliability through automation and best practices.
  • Participate in architecture reviews, capacity planning, and disaster recovery strategies.
  • Drive cost optimization initiatives for cloud resources and GPU utilization.
  • Stay updated with emerging technologies in cloud-native AI infrastructure and DevOps automation.

Qualifications:

Bachelor's or master's degree in Computer Science, Information Technology, or a related field.


Remote Work:

No


Employment Type:

Full-time

Key Skills

  • Computer Science
  • Docker
  • Kubernetes
  • Python
  • VMware
  • C/C++
  • Go
  • System Architecture
  • gRPC
  • OS Kernels
  • Perl
  • Distributed Systems

About Company

Nagarro helps future-proof your business through a forward-thinking, fluidic, and CARING mindset. We excel at digital engineering and help our clients become human-centric, digital-first organizations, augmenting their ability to be responsive, efficient, intimate, creative, and susta ...