Cloud & AI Infrastructure Engineer | AWS, Azure, Multi-LLM Deployment, Kubernetes, Terraform, Security & Observability

Synechron

Job Location: Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1

Job Summary
Synechron is seeking a highly skilled Agentic AI Platform Engineer to support the deployment, management, and optimization of cloud infrastructure for multi-model AI. In this role, you will design and maintain scalable, secure, and resilient AI platform components on AWS, including Bedrock, EKS, and Redshift. Your expertise will enable efficient AI model routing, data integration, and observability, all critical to the organization's AI and digital transformation strategies. You will collaborate with cross-functional teams to develop robust APIs, automate deployment workflows, and ensure compliance with enterprise standards.

Software Requirements

  • Required: AWS Bedrock, EKS, EC2, Terraform (latest stable), Jenkins, CloudWatch, X-Ray, Splunk, New Relic, PostgreSQL, Redshift, REST, SOAP, GraphQL, OAuth2, streaming protocols

  • Preferred: AWS Nova, OpenAI, Anthropic, Gemini, Kubernetes, Helm, monitoring tools (Datadog, Instana), enterprise security tools (SCPs, IAM roles), multi-LLM routing platforms

  • Experience level: 5 years of AWS cloud platform engineering with a focus on large-scale AI infrastructure and microservice deployment

Overall Responsibilities

  • Design, deploy, and manage cloud infrastructure supporting multi-tenant, fault-tolerant AI platforms on AWS, including Bedrock and EKS clusters

  • Develop and maintain infrastructure as code (Terraform modules, CloudFormation templates) for scalable and repeatable deployments

  • Configure and optimize gateways such as Kong, MCP servers, and multi-LLM routing architectures

  • Implement comprehensive observability using CloudWatch, X-Ray, Splunk, and New Relic to track system health, SLA adherence, and incident detection

  • Automate deployment workflows and CI/CD pipelines to accelerate model updates and platform releases

  • Support data infrastructure, including PostgreSQL and Redshift, for RAG data storage and retrieval workflows

  • Collaborate with AI/ML, security, and network teams to ensure compliant, high-security deployments

  • Conduct root cause analysis of incidents, optimize platform performance, and implement preventive measures

  • Document system architecture, platform APIs, and operational procedures, ensuring adherence to enterprise standards

Technical Skills (By Category)

  • Programming Languages:

    • Essential: Terraform, Python, Shell scripting, Azure CLI, Azure PowerShell

    • Preferred: Java, Go, or other scripting languages for automation and integration

  • Cloud Technologies:

    • AWS (Bedrock, EC2, EKS, S3, Redshift, IAM, VPC), Azure (AKS, Function Apps, Azure SQL), GCP (optional)

  • Frameworks and Libraries:

    • Kubernetes, Helm charts, Istio or other service mesh tools, monitoring SDKs (CloudWatch, Datadog, Prometheus)

  • Development Tools & Methodologies:

    • Terraform modules, Jenkins, Bitbucket/GitHub, CI/CD pipelines, Agile/Scrum, Infrastructure as Code, version control, and automation best practices

  • Security & Compliance:

    • Managing IAM roles, SCPs, encryption standards, network security policies, and compliance standards (GDPR, SOC, HIPAA, as applicable)

Experience Requirements

  • 5 years supporting cloud infrastructure and deployment of large-scale AI/ML workloads on AWS or multi-cloud environments

  • Proven experience deploying and optimizing multi-model LLM platforms such as Bedrock, OpenAI, or Anthropic

  • Strong expertise in infrastructure automation (Terraform, CloudFormation) and container orchestration (Kubernetes, EKS)

  • Skilled in multi-LLM routing, SLA management, and incident response in AI systems

  • Experience working with data infrastructure like PostgreSQL and Redshift for RAG workflows

  • Industry experience in enterprise AI, fintech, or cloud-native architectures is preferred; equivalent enterprise infrastructure experience is acceptable

Day-to-Day Activities

  • Design, manage, and optimize cloud infrastructure supporting multi-tenant AI workloads

  • Automate deployment and scaling of AI platform components, pipelines, and models

  • Configure and monitor gateways, routing engines, and security modules for enterprise-grade availability

  • Implement and improve observability frameworks to proactively detect and resolve issues

  • Support data infrastructure and model deployment workflows, including data ingestion, storage, and retrieval

  • Conduct root cause analysis, incident resolution, and disaster recovery drills

  • Collaborate with AI/ML, security, and infrastructure teams to ensure compliance and security

  • Develop and maintain documentation on platform architecture, API endpoints, and operational procedures

  • Stay informed on emerging AI platform architectures, cloud services, and security practices

Qualifications

  • Bachelor's or Master's degree in Computer Science, Cloud Computing, Data Science, or a related field

  • 5 years supporting cloud platform engineering, AI infrastructure deployment, and data workflows

  • Certifications in AWS, Azure, or GCP cloud solutions and security (preferred)

  • Proven experience managing multi-LLM deployment platforms and model routing architectures

  • Strong troubleshooting, performance tuning, and incident analysis skills

  • Excellent communication and collaboration abilities across technical teams

Professional Competencies

  • Critical thinking for designing scalable, fault-tolerant platforms

  • Leadership and team management skills supporting cross-team collaboration

  • Effective communication for stakeholder engagement and documentation

  • Adaptability to evolving cloud and AI architecture trends

  • Ownership of platform stability, security, and continuous improvement

  • Time management proficiency to coordinate multiple deployment cycles and incident responses

SYNECHRON'S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative, "Same Difference", is committed to fostering an inclusive culture promoting equality, diversity, and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company. We encourage applicants from across diverse backgrounds, race, ethnicities, religion, age, marital status, gender, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.


All employment decisions at Synechron are based on business needs, job requirements, and individual qualifications, without regard to the applicant's gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Required Experience:

IC

About Company

At Synechron, we believe in the power of digital to transform businesses for the better. Our global consulting firm combines creativity and innovative technology to deliver industry-leading digital solutions. Progressive technologies and strategies ...