SRE DevOps engineer (with Python and ML frameworks)

N-iX

Job Location:

Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: 01-09-2025
Vacancies: 1 Vacancy

Job Summary

N-iX is a global software development service company that helps businesses across the world develop successful software products. Founded in 2002, N-iX has come a long way, expanding its presence across Europe, the US, and Latin America. Today we are a strong community of 2000 professionals and a reliable partner for global industry leaders and Fortune 500 companies.

Our client is a global commerce leader where you can influence how the world buys, sells, and gives. You'll be part of a work culture that's been genuinely committed to diversity and inclusion since its founding over twenty-five years ago. Here you can be yourself, do your best work alongside a team of professionals, and have a meaningful impact on people across the globe. We seek people with drive, ideas, and a passion for helping small businesses succeed.

About the team:
We are the AI Platform Team, providing highly available, scalable, and automated machine learning infrastructure for researchers and data scientists globally. We are looking for a motivated, self-reliant SRE / DevOps engineer with Python and ML framework experience to drive operational excellence, automation, and platform reliability.

Role Overview:
This role focuses on maintaining, deploying, and improving AI/ML platform services, with a strong emphasis on DevOps, SRE practices, and automation. You will collaborate closely with developers, researchers, and infrastructure teams to ensure robust, scalable, and highly available ML systems.

Responsibilities:

DevOps (60%):

  • Design, implement, and maintain CI/CD pipelines for AI/ML platform services.
  • Manage and troubleshoot Kubernetes clusters, Docker containers, and cloud infrastructure.
  • Ensure high availability (99.999%), system reliability, and security across platforms.
  • Automate operational tasks, monitoring, and deployment workflows.
  • Collaborate with AI platform developers to deploy and scale ML frameworks efficiently.
  • Analyze and resolve production issues, performance bottlenecks, and functional problems.
  • Define operational standards and versioning practices, and advise teams on DevOps best practices.
  • Prepare documentation and training materials, and provide technical support to platform users.
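
As a rough illustration of what the 99.999% availability target above implies, the following Python sketch converts an availability percentage into an allowed-downtime budget. The function name and the 365-day and 30-day periods are assumptions chosen for illustration; they are not part of the role description.

```python
def downtime_budget_seconds(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the seconds of downtime allowed in a period for an availability target."""
    # Fraction of the period during which the service may be unavailable.
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * period_days * 24 * 60 * 60


if __name__ == "__main__":
    # Illustrative periods only: a 365-day year and a 30-day month.
    for label, days in (("365-day year", 365.0), ("30-day month", 30.0)):
        seconds = downtime_budget_seconds(99.999, days)
        print(f"99.999% over a {label}: {seconds:.2f} s ({seconds / 60:.2f} min)")
```

Under these assumptions, "five nines" leaves about 315 seconds (roughly 5.26 minutes) of downtime per year, or about 26 seconds per 30-day month.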

Development (40%):

  • Design, build, and refactor Python services and ML framework integrations.
  • Work with ML frameworks such as PyTorch, TensorFlow, and Triton.
  • Handle framework-related issues, version upgrades, and environment compatibility.
  • Work with Ray ecosystem libraries such as Ray Train, Ray Tune, Ray Serve, and Ray Data.
  • Integrate Ray with tools such as Airflow, MLflow, Dask, and DeepSpeed (a plus).
  • Support AI/ML model training, inferencing platforms, and LLM fine-tuning systems.
  • Collaborate with developers to integrate ML pipelines into automated CI/CD workflows.

Requirements:

  • Strong Python development experience (2-4 years).
  • Overall 3-5 years of relevant DevOps / SRE experience.
  • Hands-on experience with ML frameworks (PyTorch, TensorFlow, Triton).
  • Hands-on experience with cluster deployment, workload management, and distributed task scheduling.
  • Familiarity with Ray ecosystem libraries (Train, Tune, Serve, Data) and integration with ML tooling.
  • Experience with AI/ML model training and inferencing platforms is a plus.
  • Familiarity with LLM fine-tuning systems is a plus.
  • Solid understanding of Kubernetes, Docker, Linux fundamentals, and DevOps practices.
  • Experience with CI/CD pipelines (Jenkins or similar), test automation, and monitoring.
  • Strong debugging and triaging skills.
  • Excellent communication and collaboration skills with cross-functional teams.
  • Strong organizational skills to manage multiple projects in a fast-paced environment.
  • Fluent in English (spoken and written).

We offer*:

  • Flexible working format - remote, office-based, or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team buildings
  • Other location-specific benefits

*not applicable for freelancers

Key Skills

  • APIs
  • Docker
  • Jenkins
  • REST
  • Python
  • AWS
  • NoSQL
  • MySQL
  • JavaScript
  • Postgresql
  • Django
  • GIT

About Company

N-iX is a global software development company that helps the world's leading organizations achieve lasting business value using advanced technology.