Role Overview:
You will play a critical role in bringing the AI initiatives for our products to life, ensuring
they are not just functional but production-grade, secure, and maintainable. This role
requires a unique blend of Machine Learning Engineering expertise, MLOps best
practices, and a deep understanding of the practical challenges of deploying generative
AI systems in a secure enterprise environment.
Key Responsibilities:
Production Application Development: Lead the end-to-end lifecycle of LLM
applications, transitioning functional prototypes into robust, scalable, and
resilient production systems.
LLM API Integration & Orchestration: Design and implement robust integrations
with various LLM APIs (e.g., OpenAI, Anthropic, internal models), optimizing
performance, cost, and reliability.
Prompt Engineering & Optimization: Develop, test, and refine advanced prompt
engineering techniques to ensure accurate, relevant, and reliable model outputs
tailored to specific business use cases.
Context Management & RAG Implementation: Implement strategies for effective
context management, including Retrieval-Augmented Generation (RAG) systems,
vector databases, and memory structures, to enhance model relevance and
accuracy.
Output Validation & Quality Assurance: Establish rigorous validation frameworks
to automatically check and verify LLM outputs against predefined constraints,
minimizing hallucinations and ensuring compliance with quality standards.
AI Security & Risk Mitigation: Implement robust security protocols to protect
against adversarial attacks, specifically focusing on prompt injection, indirect
prompt injection, and SQL injection vulnerabilities within the LLM application
stack.
Production Deployment & Monitoring: Apply MLOps principles to deploy
applications across cloud infrastructures (e.g., AWS, GCP, Azure), setting up
comprehensive monitoring for performance metrics, latency, token usage, and
drift using tools like MLflow, Weights & Biases, or Prometheus.
Required Skills and Qualifications:
Experience: 5 years of professional experience as an ML Engineer or MLOps
Engineer, with significant experience specifically focused on deploying LLM
applications into production environments (beyond just demos).
Technical Proficiency:
o Strong programming skills in Python.
o Hands-on experience with ML frameworks (e.g., PyTorch, TensorFlow) and
orchestration tools (e.g., Kubeflow, Airflow).
o Proficiency with cloud platforms (AWS, GCP, or Azure) and
containerization technologies (Docker, Kubernetes).
o Experience with vector databases (e.g., Pinecone, Weaviate, Chroma) and
RAG architecture patterns.
o Familiarity with MLOps tools for tracking, deployment, and monitoring.
LLM Domain Knowledge: Deep understanding of current LLM capabilities and
limitations, prompt engineering best practices, and emerging security
vulnerabilities in generative AI.
Problem-Solving: Strong analytical skills with a proactive approach to
troubleshooting complex production issues related to model performance,
latency, and system stability.
Communication: Excellent collaboration and communication skills, with the ability
to work effectively within cross-functional teams (Data Scientists, Software
Engineers, Security Teams).