GenAI Platform Support Engineer
Infrastructure Cloud Experience DevOps Deployment Solutions Tools Techniques Terraform AWS Python

Job Summary
As a GenAI Platform Support Engineer, you will provide technical expertise and support for our Generative AI platform. You will work closely with cross-functional teams to troubleshoot issues, implement enhancements, monitor system performance, and ensure smooth, efficient operation of the platform.

Key Responsibilities
Assess and improve the resilience of the AI platform's data pipelines.
Ensure AI/ML models deployed are fault tolerant, scalable, and optimized for real-time use cases.
Identify and resolve bottlenecks in model inference and training workflows.
Collaborate with DevOps teams to enhance deployment processes and infrastructure.
Implement automated testing frameworks to validate fault tolerance and operational robustness.
Provide frontline technical support, troubleshoot platform issues, and respond to user queries promptly.
Monitor platform health using observability tools and apply optimization strategies proactively.
Document technical processes, incidents, and resolutions to build a strong knowledge base.
Stay current with emerging trends and advancements in generative AI technologies and cloud tooling.

Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field.
Proven experience in support and maintenance of complex platforms.
Strong knowledge of database management systems (SQL, NoSQL).
Hands-on experience with cloud platforms such as AWS, Azure, or GCP.
Expertise in provisioning cloud services using Infrastructure as Code tools like Terraform.
Familiarity with OpenAI and generative AI cloud services is a plus.
Proficient in scripting languages, including Python and Shell scripting.
Solid understanding of container orchestration using Kubernetes (OpenShift preferred).
Experience troubleshooting services deployed in Kubernetes environments.
Strong knowledge of DevOps build and release pipelines and infrastructure requirements.
Familiarity with AI-related tools such as LangChain, LangSmith, LiteLLM, UiPath AI Center, and the Azure OpenAI platform is beneficial.
Skilled in observability and monitoring tools like Prometheus, Grafana, CloudWatch, and Azure OpenAI monitoring, with dashboard setup capabilities.
Experienced with GitHub Actions, Jenkins, and Ansible automation frameworks.
Knowledge of cloud security and compliance best practices, including IAM roles and secrets management (Vault, AWS Secrets Manager).
Excellent problem solving abil.