Project Manager R

Bangalore - India

Monthly Salary: Not Disclosed

Posted on: 17 hours ago

Vacancies: 1

Job Summary

Project Manager

Primary Skills

Project Delivery Management
Manage Fixed Price Delivery
Estimations and Metrics
Client Management
Communications Management
Agile
Agile Metrics and Reporting
Professional Scrum Master-CSM/PSM1
Manage Outcome Based Delivery
Requirements Creation/User Stories
Digital Acumen
Agile Coaching
Change Management (Project Management)
Backlog Grooming

Job requirements

Job Title: AI Agent Evaluation Engineer

Job Description: We are seeking a highly motivated and technically proficient AI Agent Evaluation Engineer to join our growing AI team. This crucial role is responsible for defining, developing, and executing robust agent evaluation frameworks and test strategies, with a significant focus on Responsible AI and Safety Evals for our agents built using the Google Agent Development Kit (ADK). The ideal candidate will bridge the gap between AI development and reliable deployment, ensuring our agents are safe, ethical, and effective, and meet high-quality performance standards. The role is 70% automation and 30% manual testing.

Key Responsibilities

Evaluation (Evals) Development: Develop synthetic testing environments and simulation strategies to stress-test agents under various real-world conditions. Design, implement, and maintain scalable and repeatable evaluation datasets and metrics to test agent performance, robustness, safety, and alignment (e.g., faithfulness, hallucination, prompt injection). Focus specifically on building Evals for agents that use the Google Agent Development Kit (ADK) and related Google AI/ML services (e.g., Vertex AI, Gemini models).

Responsible AI and Safety Evals (New Focus): Develop and execute adversarial testing, jailbreaking, and red-teaming methodologies to identify potential harm, bias, toxicity, and unauthorized behavior in agent responses. Implement and measure adherence to established ethical guidelines, safety policies, and content filtering mechanisms. Work with policy and legal teams to ensure agent evaluations cover regulatory compliance and fairness objectives.

Test Strategy & Execution: Define comprehensive QA strategies, including functional, integration, regression, and user acceptance testing (UAT), specifically for conversational and goal-oriented AI agents. Develop and execute detailed test artefacts, such as test plans, test cases, and test scenarios, for agent features, tool use, memory, and reasoning capabilities.

Bug Detection & Management: Identify, document, prioritize, and track bugs, performance degradations, and alignment failures in agent behavior using Jira. Collaborate closely with AI/ML Engineers and Researchers to analyze root causes and validate fixes.

Automation & Tools: Integrate evaluation pipelines into the CI/CD process to enable continuous quality assurance and fast iteration cycles.

Reporting & Insights: Analyze and interpret evaluation results, providing clear, actionable insights and quality reports to stakeholders and development teams, with a specific focus on safety metrics and risk mitigation.

Required Skills & Qualifications

Experience: 6 years in Software QA, with at least 2 years focused on testing or evaluating AI/ML systems, conversational agents, or Large Language Models (LLMs).

Safety Evals Expertise (Mandatory): Direct experience in designing and executing safety evaluations (red teaming, adversarial testing), bias detection, and measurement of toxicity and harmful content in generative AI models.

Agent/LLM Evals: Proven experience developing and running general evaluations (Evals) for LLM-powered applications; knowledge of libraries such as PyTest is a must.

Google ADK Familiarity (Mandatory): Direct experience with, or a strong conceptual understanding of, the Google Agent Development Kit (ADK) and its components.

Programming: Strong proficiency in Python is mandatory for script development, data processing, and automation.

Cloud & MLOps: Familiarity with Google Cloud Platform (GCP) services relevant to AI/ML (e.g., Vertex AI) and with integrating testing into MLOps workflows.

Tools and Libraries: LangSmith, DeepEval, Ragas, Giskard, Hugging Face.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Key Skills

Project Management Methodology
Project / Program Management
Construction Estimating
Construction Experience
PMBOK
Visio
Construction Management
Project Management
Project Management Software
Microsoft Project
Project Management Lifecycle
Contracts

About Company

Brillio

Brillio is a global leader in Enterprise Digital Transformation Solutions, providing strategic consulting services and solutions using emerging technologies.
