AI Automation QA
Job Location:
Taguig - Philippines
Monthly Salary:
Not Disclosed
Posted on:
14 days ago
Vacancies:
1 Vacancy
Job Summary
Description
Key Accountabilities
- Definition and execution of testing and quality assurance strategies for AI-enabled workflows
- Continuous evaluation and monitoring of system behavior in production environments
- Contribution to auditability, risk management, and continuous quality improvement
Principal Responsibilities
- Define quality criteria and testing strategies for agent workflows, covering accuracy, latency, safety, compliance, and operational risk
- Build automated evaluation harnesses to assess agent performance, including hallucination rates, tool misuse, policy violations, and task success
- Implement continuous production monitoring to detect anomalies, quality degradation, and emerging safety concerns
- Develop and maintain automated test suites using Playwright for UI testing and custom scripts for API and workflow validation
- Apply LLM evaluation frameworks to assess output quality, regression, and system drift over time
- Produce and maintain dashboards and reports that communicate quality metrics and trends to engineering and stakeholders
- Develop and maintain runbooks for common failure modes and contribute to incident response activities
- Collaborate closely with developers to improve prompts, tool definitions, and workflow designs based on test results
- Ensure testing, logging, and monitoring practices align with data privacy, audit, and regulatory requirements
Qualifications
Knowledge, Skills & Experience
Essential
- Minimum 3 years' experience in QA, test automation, or DevOps roles (or 2 years with direct experience testing AI- or ML-enabled systems)
- Strong Python skills for test automation, evaluation harnesses, and basic data analysis
- High attention to detail with a focus on issues that materially impact reliability and user trust
- Comfort working with evolving tools, frameworks, and testing practices
- Collaborative mindset, using evidence-based insights to influence product and engineering decisions
Technical Skills (Required)
- Programming: Python (test automation, evaluation harnesses, data analysis)
- UI Automation: Playwright (end-to-end workflow testing)
- AI Evaluation: DeepEval, RAGAS (LLM quality, drift, and regression analysis)
- Workflow Testing: API and agent workflow validation using custom scripts
- Monitoring: Production quality monitoring and anomaly detection
Desirable
- Pytest or equivalent testing frameworks
- SQL for querying logs, metrics, or evaluation datasets
- Prometheus, Grafana, or similar monitoring tools
- Familiarity with hallucination detection and AI safety patterns
- CI/CD pipelines and Git-based workflows
WTW is an Equal Opportunity Employer