Job Description: Cloud Infrastructure Engineer (AWS & Azure AI/ML Focus)
Overview
We are seeking a highly skilled Cloud Infrastructure Engineer with 8 to 10 years of experience designing, building, and supporting large-scale cloud platforms across AWS and Azure. This role focuses on enabling AI/ML workloads, building scalable infrastructure, and implementing DevOps-driven automation using CDK, CDKTF, and Terraform.
Key Responsibilities
1. Cloud Infrastructure Design & Engineering
Work closely with a team of Cloud Engineers to design, build, and maintain robust, scalable AWS infrastructure.
Architect and provision AWS platform services required to support AI and Machine Learning applications.
Develop end-to-end cloud solutions with a strong emphasis on scalability, reliability, and performance.
2. Collaboration with Development & Product Teams
Partner with development teams to analyze technical requirements and translate them into cloud architecture and platform services.
Work with Product Managers to design tools that support:
Experimentation environments
ML model training workflows
Production-grade ML operations
3. Data & DevOps Engineering
Design and provision end-to-end data solutions enabled by DevOps practices using AWS CDK, CDKTF, or Terraform.
Build scalable, low-latency systems and implement capacity planning frameworks.
Design and manage data pipelines, ensuring efficiency and availability.
4. Reliability, Resilience & Disaster Recovery
Implement disaster recovery strategies and highly available architectures for critical cloud workloads.
Ensure systems meet reliability, security, and operational excellence standards.
5. Engineering Excellence & Delivery
Promote software engineering best practices including:
High-quality coding standards
Rigorous code reviews
Strong documentation practices
Identify and resolve technical challenges proactively.
Prioritize and balance concurrent projects to meet organizational goals efficiently.
Must-Have Skills & Qualifications
Cloud Platforms
Strong experience architecting, provisioning, and managing AWS cloud infrastructure.
Hands-on experience deploying and supporting Azure cloud solutions.
Development & Automation
Proficiency in at least one programming language:
TypeScript
Python (including Boto3 for AWS automation)
Experience building CI/CD pipelines (e.g., AWS CodePipeline, Azure DevOps, GitHub Actions).
Infrastructure as Code
Hands-on experience with:
AWS CDK
CDK for Terraform (CDKTF)
Terraform
AI/ML & Azure Data Services
Experience configuring and deploying Azure AI & Data services such as:
Azure OpenAI
Azure AI Search
Azure AI Translator
Power BI
Azure Data Lake Storage
Azure Databricks
Azure Data Factory
Azure Machine Learning
Azure SQL Database
Containerization
Experience with Docker and container-based deployment workflows.
Problem Solving & Communication
Strong decision-making, technical consulting, and problem-solving abilities.
Ability to collaborate effectively with internal and external stakeholders.