AI Eval Testing (Eval Engineer)

TalentOla


Job Location:

Fort Worth, TX - USA

Monthly Salary: Not Disclosed
Posted on: 10 days ago
Vacancies: 1 Vacancy

Job Summary

AI Eval / Testing (Eval Engineer)

Client: Alcon

Location: Fort Worth, TX (hybrid preferred, remote possible)

Job Title: Sr AI Engineer

Experience Level: 6-10 years

Job Summary

We are seeking a skilled AI Engineer to build, integrate, and operationalize AI/ML models and agent workflows on AWS and Azure as the core AI foundation, with Microsoft Copilot as the primary user experience layer. The role involves collaborating with AI Architects and data teams to deploy scalable, production-grade AI solutions that are grounded in enterprise data, governed responsibly, and optimized for real-world performance. The candidate should be able to own the full AI engineering lifecycle, from prototyping and integration through to production deployment and ongoing optimization.

Key Responsibilities

LLM & Agent Development

  • Build, integrate, and iterate on LLM-powered agent experiences for enterprise knowledge access and workflow automation.
  • Own prompt engineering, orchestration logic, and multi-agent workflow design using AWS Bedrock and Azure AI services.
  • Implement grounding, citation enforcement, and refusal behavior patterns aligned with enterprise governance standards.
  • Build structured triage and escalation logic within agent workflows to support robust, production-grade AI systems.
  • Own the AI engineering layer end to end, from prototype through pilot validation to production deployment.

RAG & Retrieval Engineering

  • Implement RAG pipelines using structured and unstructured enterprise data on AWS and Azure cloud-native services.
  • Tune retrieval quality through vector search, re-ranking strategies, and context window optimization.
  • Work with embedding models, chunking strategies, and hybrid retrieval approaches to improve answer relevance.
  • Integrate vector databases such as Azure AI Search and Amazon OpenSearch to support enterprise RAG systems.

Evaluation & Quality Assurance

  • Define and run evaluation frameworks to measure answer accuracy, hallucination rates, and response quality.
  • Ensure relevance, freshness, observability, and security of AI outputs across production environments.
  • Implement monitoring and drift detection for deployed AI models and LLM-based workflows.

Cloud & Platform Integration

  • Build and deploy AI solutions on AWS (Bedrock, SageMaker, Lambda) and Azure (Azure AI Foundry, Azure ML, Azure Functions) as the primary cloud platforms.
  • Integrate Microsoft Copilot as the enterprise user experience layer, connecting AI capabilities to end-user workflows.
  • Integrate AI components with enterprise applications, APIs, and data sources to enable scalable, end-to-end AI workflows.
  • Ensure compliance with security, privacy, and responsible AI guidelines across all AI deployments.

Collaboration & Delivery

  • Collaborate with AI Architects and data teams to translate architectural designs into production-ready AI implementations.
  • Work within cross-functional delivery teams to align AI solutions with business and product requirements.
  • Support POCs and pilot programs that demonstrate the value and feasibility of AI solutions before full-scale deployment.

Required Qualifications

  • 6-10 years of overall experience, including 3 years building LLM-powered applications in production or near-production environments.
  • Hands-on experience with AWS AI/ML services (Bedrock, SageMaker) and Azure AI services (Azure AI Foundry, Azure OpenAI Service, Azure ML).
  • Experience integrating with Microsoft Copilot or building Copilot extensibility solutions (plugins, connectors, or agents).
  • Hands-on experience with RAG architectures: vector search, embedding models, chunking strategies, and hybrid retrieval.
  • Strong understanding of grounding techniques, hallucination mitigation, and AI evaluation methodologies.
  • Experience with agent orchestration frameworks and patterns: multi-agent routing, workflow chaining, and context management.
  • Strong Python skills; familiarity with LangChain, Semantic Kernel, or equivalent agent orchestration frameworks preferred.
  • Ability to work autonomously and own the full AI engineering stack within a cross-functional delivery team.