Job Description:
The Senior AI Test Engineer is responsible for validating the quality, reliability, safety, performance, and governance of AI systems, including LLMs, RAG pipelines, and AI agents. Unlike a traditional QA role, this position focuses on LLM evaluation, behavioral testing, hallucination and bias detection, AI test automation, and alignment with Responsible AI governance standards.
Key Responsibilities:
Define AI test strategy for LLMs, AI agents, and orchestration workflows
Design and maintain golden evaluation datasets, prompt test cases, and benchmarking frameworks for LLMs and RAG systems
Perform regression testing across prompt updates, agent logic changes, multi-step agent workflows, and model upgrades
Measure and analyze the accuracy, consistency, latency, throughput, and cost efficiency of AI services
Integrate AI test suites into CI/CD pipelines with automated quality gates
Conduct red-team testing to identify safety, compliance, security, and prompt-injection vulnerabilities
Provide actionable insights through evaluation reports, metrics, and release readiness assessments
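To illustrate the kind of work described above, here is a minimal sketch of a golden-dataset regression test in pytest. The `call_model` function and the golden cases are hypothetical stand-ins; in practice the call would go to a real LLM endpoint (e.g. Azure OpenAI) and the assertion would often use semantic similarity or LLM-as-judge scoring rather than simple substring matching.

```python
import pytest

# Hypothetical golden dataset: each entry pairs a prompt with a key fact
# the model's answer must contain for the regression check to pass.
GOLDEN_CASES = [
    ("What is the capital of France?", "Paris"),
    ("How many days are in a leap year?", "366"),
]


def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM client call; replace with an
    # actual endpoint invocation in a real test suite.
    canned_answers = {
        "What is the capital of France?": "The capital of France is Paris.",
        "How many days are in a leap year?": "A leap year has 366 days.",
    }
    return canned_answers[prompt]


@pytest.mark.parametrize("prompt,expected_fact", GOLDEN_CASES)
def test_golden_dataset(prompt: str, expected_fact: str) -> None:
    # Simple containment check on the model output; a quality gate in
    # CI/CD would fail the build when a prompt or model update regresses.
    assert expected_fact in call_model(prompt)
```

Wiring a suite like this into a CI/CD pipeline (e.g. a GitHub Actions job that runs `pytest` on every prompt or model change) is one way to realize the automated quality gates mentioned above.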
Qualifications:
7-8 years of experience in AI test automation
Proven experience in testing AI/ML/LLM-based systems
Strong understanding of:
Prompt behavior, model evaluation, dataset curation, and risk management
Hands-on experience with:
Python, pytest, and Java
Test automation tools (Selenium, Playwright)
AI evaluation tools and frameworks
Azure AI Services / Azure OpenAI, CI/CD pipelines, and GitHub Actions
Test automation integration into DevOps workflows
Required Experience:
Senior IC