Product Manager, AI Evaluation & Developer Platforms

Techifide

Job Location:

São Paulo - Brazil

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

About the Opportunity

We're partnering with a fast-growing AI company building advanced, enterprise-grade artificial intelligence solutions. Their products help organisations transform complex information into actionable insights through cutting-edge machine learning, agentic systems, and intelligent automation.

As the company continues to scale, they are seeking an experienced Product Manager to lead the strategy and execution of their AI Evaluation and Developer Tooling ecosystem.

This is a highly technical role at the intersection of Product Management, Machine Learning, AI Quality Assurance, and Developer Experience.

The Role

As Product Manager for AI Evaluation & Developer Platforms, you will own the systems that measure, validate, and improve AI performance across the organisation.

You'll define how AI quality is assessed, how model behaviour is benchmarked, and how internal teams can rapidly identify issues, compare results, and ship improvements with confidence.

Working closely with engineering, data science, research, and subject matter experts, you'll shape the tools and frameworks that enable continuous improvement across AI products and services.

This position requires someone who is comfortable discussing evaluation metrics one moment and developer workflows the next.

Key Areas of Ownership

Data Quality & Ingestion Evaluation

Develop frameworks that assess the accuracy, completeness, and reliability of incoming data.

You'll help identify:

Data extraction issues
Parsing and transformation errors
Schema inconsistencies
Data loss across pipelines
Quality degradation before it impacts downstream systems

Agent Performance Evaluation

Design methods for measuring how AI agents plan, reason, and complete multi-step tasks.

Areas of focus include:

Task completion success
Reasoning quality
Decision-making robustness
Workflow execution
Agent reliability under changing conditions

Tool Usage & Execution Assessment

Create evaluation frameworks that verify whether AI systems:

Select appropriate tools
Pass correct parameters
Interpret responses accurately
Recover from failures gracefully
Execute workflows reliably

Responsibilities

Own the roadmap for AI evaluation frameworks and internal developer tooling.
Define quality standards, benchmarks, scoring methodologies, and success metrics.
Create detailed product requirements, acceptance criteria, user stories, and functional specifications.
Partner closely with engineering teams throughout delivery cycles.
Build systems that enable teams to create datasets, run experiments, analyse results, and compare model performance.
Develop workflows that incorporate expert review, annotation, and human feedback.
Track the adoption, effectiveness, and business impact of evaluation tools.
Define and monitor KPIs and OKRs for quality, coverage, usability, and platform performance.
Collaborate with design teams to deliver intuitive developer experiences and evaluation dashboards.
Present findings and recommendations to technical and executive stakeholders.
Ensure traceability from business requirements through to delivered capabilities.

Requirements

Essential

7 years of Product Management experience within highly technical environments.
Previous experience as an ML Engineer, AI Engineer, Applied Scientist, or equivalent hands-on AI role.
Deep understanding of AI evaluation methodologies and benchmarking techniques.
Strong knowledge of Large Language Models (LLMs), AI agents, retrieval systems, and modern machine learning workflows.
Experience defining metrics, automated testing strategies, and quality assurance processes for AI products.
Familiarity with agent architectures, tool calling, API integrations, and multi-step reasoning systems.
Experience building products for technical users such as engineers, researchers, analysts, or data scientists.
Ability to write clear, testable, and actionable product requirements.
Strong stakeholder management and communication skills.
Experience working within Agile product development environments.

Desirable

Experience building evaluation frameworks for LLM, RAG, or agent-based applications.
Knowledge of data ingestion, ETL, data quality monitoring, or data governance.
Experience with annotation platforms, human feedback systems, or labelling workflows.
Exposure to regulated or domain-specific industries such as government, healthcare, legal, or financial services.

Why This Role

This is a rare opportunity to help define how AI quality is measured at scale.

You'll influence the evaluation standards, developer tooling, and decision-making processes that underpin next-generation AI systems while working alongside highly technical teams solving complex, real-world challenges.
