Job Title: AI Sr. Engineer, LLMOps & MLOps
Location (On-site, Remote, or Hybrid): Memphis, TN (remote)
Contract Duration: Direct Hire
Job Description
Role Overview
This is a high-stakes, execution-focused role within the Transformation Office. We are looking for a day-one engineer to own the production lifecycle of our AI initiatives. Your mission is to build the automated infrastructure that bridges our legacy data systems with modern AWS and Azure AI services. You will be responsible for the Ops of AI: ensuring that LLM applications, RAG pipelines, and traditional ML models are deployable, observable, and scalable in a multi-cloud environment.
Key Responsibilities
- Multi-Cloud Pipeline Execution: Build and maintain automated CI/CD and CT (Continuous Training) pipelines across AWS (SageMaker/Bedrock) and Azure (AI Studio).
- LLMOps Framework Implementation: Design and execute the infrastructure for Retrieval-Augmented Generation (RAG), including vector database management (OpenSearch, Pinecone, or Azure AI Search) and semantic index optimization.
- Legacy Data Connectivity: Build the engineering pipes to securely ingest and move data from legacy systems (mainframes, SQL Server, on-prem DBs) into cloud-native MLOps workflows.
- Automated Model Evaluation: Implement systematic frameworks for LLM evaluation (LLM-as-a-judge, ROUGE, METEOR) and traditional ML validation to verify performance before deployment.
- Observability & Monitoring: Deploy real-time monitoring for model drift, hallucination detection, latency, and token consumption to manage both quality and cost.
- Infrastructure as Code (IaC): Manage all AI resources using Terraform or CloudFormation, ensuring the cloud posture is reproducible, secure, and compliant with a Privacy by Design mandate.
- Advanced Analytics Integration: Partner with teams using platforms like Palantir, Databricks, or Snowflake to ensure a high-fidelity data flow between analytical ontologies and production models.
- IT & Security Diplomacy: Work directly with central IT and Security to navigate IAM roles, VPC peering, and firewall configurations, clearing the path for rapid transformation.
- Scalable Inference Engineering: Optimize model serving endpoints for high throughput and low latency, using containerization (Docker/Kubernetes) and serverless architectures where appropriate.
- Prompt & Model Versioning: Establish rigorous version control for prompts (PromptOps), model weights, and data snapshots to ensure 100% auditability and rollback capability.
- Data Science Engineering: Support the data science lifecycle by automating feature stores, feature engineering pipelines, and the transition of experimental notebooks into hardened production microservices.
- Security & Compliance Hardening: Implement automated scanning and guardrails (e.g., Bedrock Guardrails or Azure Content Safety) to prevent prompt injection and data leakage.
Qualifications
- Education: Bachelor's degree in Computer Science or a related field required; Master's degree in a quantitative discipline highly desirable.
- Proven Execution: 6 years of engineering experience, with a minimum of 3 years focused strictly on MLOps or LLMOps in a production environment.
- AWS & Azure Mastery: Deep, hands-on proficiency in both ecosystems. You must be able to configure Bedrock and Azure OpenAI services, including private networking and endpoint security, on day one.
- Technical Stack: Expert Python, SQL, and PySpark. Extensive experience with containerization (Docker, Kubernetes) and orchestration tools (Airflow, Kubeflow, or Step Functions).
- LLM Tooling: Professional experience with evaluation and observability frameworks like LangSmith, Arize Phoenix, or WhyLabs.
- Data Science Flavor: A strong understanding of statistical validation and model evaluation metrics, and the ability to partner with Data Scientists to optimize model performance.
- Transformation Mindset: The ability to move at the speed of a startup while maintaining the collaborative relationships required to function within a large-scale enterprise IT landscape.