Functional AI Tester - GenAI

Michelin

Job Location: Pune, India

Monthly Salary: Not Disclosed
Posted on: 11 hours ago
Vacancies: 1

Job Summary

Functional AI Tester - GenAI

About the Role

You will be involved in QA for GenAI features, including Retrieval-Augmented Generation (RAG), conversational AI, and agentic evaluations. The role centers on:

  • Systematic GenAI evaluation (qualitative and quantitative metrics)

  • ETL and data quality testing for the data flows that feed AI systems

  • Python-driven automated testing

This position is hands-on and collaborative, partnering with AI engineers, data engineers, and product teams to define measurable acceptance criteria and ship high-quality AI features.

Key Responsibilities

  • Test strategy and planning

    • Define risk-based test strategies and detailed test plans for GenAI features.

    • Establish clear acceptance criteria with stakeholders for functional, safety, and data quality aspects.

  • Python test automation

    • Build and maintain automated test suites using Python (e.g., PyTest, requests).

    • Implement reusable utilities for prompt/response validation, dataset management, and result scoring.

    • Create regression baselines and golden test sets to detect quality drift.

  • GenAI evaluation

    • Develop evaluation harnesses covering factuality, coherence, helpfulness, safety, bias, and toxicity.

    • Design prompt suites, scenario-based tests, and golden datasets for reproducible measurements.

    • Implement guardrail tests, including prompt-injection resilience, unsafe content detection, and PII redaction checks.

    • Track quality metrics over time.

  • RAG and semantic retrieval testing

    • Verify alignment between retrieved sources and generated answers.

    • Design and run adversarial tests.

    • Measure retrieval relevance, precision/recall, grounding quality, and hallucination reduction.

  • API and application testing

    • Test REST endpoints supporting GenAI features (request/response contracts, error handling, timeouts).

  • ETL and data quality validation

    • Test ingestion and transformation logic; validate schema constraints and field-level rules.

    • Implement data profiling, reconciliation between sources and targets, and lineage checks.

    • Verify data privacy controls, masking, and retention policies across pipelines.

  • Non-functional testing

    • Performance and load testing focused on latency, throughput, concurrency, and rate limits for LLM calls.

    • Cost-aware testing (token usage, caching effectiveness) and timeout/retry behavior validation.

    • Reliability and resilience checks, including error recovery and fallback behavior.

  • Share results and insights; recommend remediation and preventive actions.

Required Qualifications

  • Experience

    • 5 years in software QA, including test strategy, automation, and defect management.

    • 2 years testing AI/ML or GenAI features with hands-on evaluation design.

    • 4 years testing ETL/data pipelines and data quality.

  • Technical skills

    • Python: Strong proficiency building automated tests and tooling (PyTest, requests, pydantic, or similar).

    • API testing: REST contract testing, schema validation, negative testing.

    • GenAI evaluation: crafting prompt suites, golden datasets, rubric-based scoring, and automated evaluation pipelines.

    • RAG testing: retrieval relevance, grounding validation, chunking/indexing verification, and embedding checks.

    • ETL/data quality: schema and constraint validation, reconciliation, lineage awareness, data profiling.

  • Quality and governance

    • Understanding of LLM limitations and methods to detect/reduce hallucinations.

    • Safety and compliance testing, including PII handling and prompt-injection resilience.

    • Strong analytical and debugging skills across services and data flows.

  • Soft skills

    • Excellent written and verbal communication; ability to translate quality goals into measurable criteria.

    • Collaboration with AI engineers, data engineers, and product stakeholders.

    • Organized, detail-oriented, and outcomes-focused.

Nice to Have

  • Experience with evaluation frameworks or tooling for LLMs and RAG quality measurement.

  • Experience creating synthetic datasets to stress specific behaviors.

Key Skills

  • Change Management
  • Civil Engineering
  • Infection Control
  • Information Technology Sales
  • Biology

About Company

MICHELIN tires and services suited to your mobility. Find the right tire for your vehicle, advice from our experts, and dealers in France.
