Overview:
TekWissen is a global workforce management provider operating in India and many other countries.
Title: Cloud & DevOps Engineer (Infrastructure Platform)
Work Location: Bangalore
Job Type: Full-time
Work Type: Onsite, Monday-Friday
Shift: UK Shift - 1:30 PM to 10:30 PM IST
Job Description:
- We are seeking a Cloud DevOps & MLOps Engineer with strong hands-on experience in cloud infrastructure, automation, CI/CD, container platforms, and machine learning platform operations.
- This role requires professionals who can own cloud environments end-to-end while also supporting AI/ML workloads, model deployment pipelines, and scalable AI infrastructure.
- The ideal candidate brings practical production experience in DevOps practices and ML platform enablement, strong troubleshooting skills, and the ability to improve operational maturity across cloud, DevOps, and MLOps practices.
- The role involves collaboration with data scientists, ML engineers, and application teams to enable scalable and reliable AI-powered solutions.
Key Responsibilities:
Cloud Infrastructure Ownership:
- Design, provision, and manage infrastructure workloads across AWS, Azure, or GCP environments
- Own lifecycle management of compute, networking, storage, and platform services
- Support infrastructure required for AI/ML training, inference, and data pipelines
- Manage compute environments including GPU/accelerated workloads for machine learning
- Ensure infrastructure availability, scalability, and operational stability
- Implement infrastructure standards, templates, and reusable deployment patterns
Infrastructure as Code & Automation
- Develop and maintain infrastructure using Terraform or similar IaC tools
- Automate provisioning of environments for data science and ML experimentation
- Automate provisioning, configuration, and deployment workflows
CI/CD & Release Enablement
- Design and maintain robust CI/CD pipelines using GitHub Actions, GitLab CI, Azure DevOps, or Jenkins
- Enable ML model CI/CD pipelines (MLOps) for model versioning, validation, and deployment
- Automate build, test, security scan, and deployment pipelines for both applications and ML models
- Enable automated build, test, security scan, and deployment pipelines
Containerization & Kubernetes
- Build, deploy, and manage containerized applications using Docker
- Support Kubernetes clusters for microservices and ML inference workloads
- Manage scalable deployment of AI model APIs
ML Platform Support
- Support infrastructure for machine learning workflows and model lifecycle
- Enable model training, experiment tracking, and model deployment pipelines
- Collaborate with data scientists and ML engineers to operationalize models
- Support frameworks such as MLflow, Kubeflow, Azure ML, and SageMaker
System Administration & Platform Reliability
- Manage Linux / Windows server environments, including patching, performance tuning, and security hardening
- Support high-availability environments for AI applications and data pipelines
- Participate in incident response, root cause analysis, and resolution activities
- Improve monitoring, alerting, and operational readiness practices
- Maintain documentation for infrastructure and operational runbooks
Security & Access Management
- Implement IAM policies, RBAC controls, and secure access models
- Secure ML pipelines and data access
- Ensure secure handling of secrets, certificates, and credentials
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field
- 6-12 years of experience in Cloud Engineering, DevOps, Infrastructure Engineering, or Platform Support roles
- Strong hands-on experience with at least one public cloud (AWS / Azure / GCP)
- Proven experience implementing Infrastructure as Code using Terraform
- Experience building and maintaining CI/CD pipelines
- Hands-on exposure to Docker and Kubernetes environments
- Strong scripting skills (Bash / Python / PowerShell)
- Understanding of cloud infrastructure for AI workloads
Preferred Experience
- Experience supporting multi-region or multi-environment cloud deployments
- Exposure to cloud monitoring tools such as CloudWatch, Azure Monitor, Prometheus, and Grafana
- Understanding of model deployment pipelines
- Experience with vector databases or AI workloads
- Understanding of cost optimization and cloud governance practices
- Experience working in global delivery or production support environments
- Exposure to platform engineering or SRE practices
Certifications (Preferred)
- AWS Associate / Azure Administrator / GCP Associate Cloud Engineer
- Terraform Associate Certification
- Kubernetes and Cloud Native Associate (KCNA) or CKA
- CompTIA Security+
- Linux Foundation Certification (LFCS / LFCE)
Key Competencies:
- Strong ownership mindset and execution discipline
- Ability to troubleshoot complex infrastructure issues
- Structured thinking and documentation capability
- Collaboration with distributed global teams
- Continuous learning and improvement mindset
Work Environment:
- Structured office-based engineering collaboration
- Exposure to AI platforms, ML pipelines, and production AI deployments
- Participation in incident troubleshooting and operational reviews
- Adherence to enterprise security and compliance standards
TekWissen Group is an equal opportunity employer supporting workforce diversity.