Senior AI Testing Engineer (Generative AI)

Wadhwani Foundation

Job Location:

Bengaluru - India

Monthly Salary: Not Disclosed

Experience Required: 05-07 years

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Senior AI Testing Engineer (Generative AI)

Location: India (Remote / Hybrid / In-office)

Experience: 5-8 years total experience in software testing, QA engineering, or SDET roles, with at least 2-3 years of meaningful hands-on exposure to Generative AI systems, LLM applications, or AI quality engineering.

Role Overview

We are looking for a Senior AI Testing Engineer to own quality across our Generative AI products and platform.

This role is fundamentally about engineering quality into AI systems, not running test scripts. You'll design evaluation frameworks, build automated testing pipelines, and define what "good" looks like for LLM outputs, RAG systems, AI agents, and voice AI applications. You'll work directly with AI engineers and product teams to make sure our systems are reliable, safe, and measurably improving over time.

If you understand how LLMs fail, know how to catch hallucinations before users do, and want to build the quality infrastructure that underpins production AI at scale, this is the role.

Key Responsibilities

Evaluation Strategy & Frameworks

Design and own comprehensive testing strategies for Generative AI products, including LLM applications, RAG pipelines, AI agents, voice AI systems, and workflow automation

Define evaluation methodologies covering functional testing, response quality, hallucination detection, safety and guardrail testing, prompt injection, bias and toxicity, retrieval quality, latency benchmarking, and agent workflow validation

Build reusable AI testing frameworks and automation pipelines for continuous evaluation

Create datasets, benchmark suites, and golden test sets for GenAI evaluation

Automated Evaluation

Develop automated evaluation pipelines using LLM-as-a-Judge and hybrid evaluation methods

Implement CI/CD-integrated AI evaluation pipelines

Drive observability and monitoring strategies for production AI systems

Quality Standards & Collaboration

Define measurable quality KPIs for AI systems

Establish testing standards, best practices, and governance processes for GenAI applications

Work closely with AI engineers, product, and platform teams to embed quality throughout the development lifecycle

Required Skills & Experience

Testing & Engineering Experience

5-8 years in software testing, QA engineering, SDET, or test automation

2-3 years of hands-on experience testing or evaluating production-grade Generative AI or LLM-based systems

Strong test automation skills in Python

Experience designing scalable automated testing frameworks

Familiarity with API testing, integration testing, and performance testing

Generative AI Knowledge

Solid understanding of how LLM systems work and how they fail

Experience with RAG architectures, prompt engineering, AI agents, embedding models, and vector databases

Understanding of LLM evaluation methodologies and AI system failure modes

GenAI Testing Frameworks

Hands-on experience with at least one GenAI evaluation framework, such as DeepEval, Ragas, LangSmith, Promptfoo, TruLens, OpenAI Evals, or LangChain evaluation tools

Quality Engineering

Expertise in test strategy, test planning, test automation architecture, defect lifecycle management, and quality metrics

Ability to define and track measurable quality KPIs for AI systems

Preferred Qualifications

Experience with cloud platforms (AWS, Azure, or GCP)

Familiarity with MLOps / LLMOps workflows

Experience with CI/CD pipelines and DevOps practices

Exposure to monitoring and observability tooling for AI systems

Understanding of security and compliance for GenAI products

Experience with conversational AI or voice AI systems

Required Education:

MBA