Senior Product Manager Tech, GenAI, Amazon Rufus

Amazon


Job Location:

London - UK

Monthly Salary: Not Disclosed
Posted on: 9 hours ago
Vacancies: 1 Vacancy

Job Summary

Amazon's Rufus AI team is building the future of conversational shopping. Rufus helps hundreds of millions of customers find and discover products through natural language, and behind every response is an automated quality measurement system powered by LLM-as-a-Judge (LLMAJ) technology. We are seeking a Sr. Product Manager-Tech to own the quality governance, global scaling, and operational excellence of this judge portfolio.

You will work alongside Language Engineers who build and tune judges, Product Managers who define quality criteria and evaluation standards, Data Scientists who operate evaluation pipelines, and Engineering teams who build the infrastructure that runs evaluations. This is a high-autonomy role: you own your domain end-to-end and are expected to drive decisions, not just track workstreams.

This role sits at the intersection of AI evaluation, product management, and applied tooling. You will own the governance framework for a portfolio of dozens of LLM judges that power critical evaluation metrics used for release decisions, competitive benchmarking, and leadership reporting. You will drive the localization of judges from en-US to 5 international marketplaces, facilitate model evaluation and debugging workflows, and build purpose-built tools and agents to automate governance operations at scale.

Key job responsibilities
- Own the LLMAJ governance framework: judge registry, versioning standards, quality validation gates, deprecation policies, and agreement rate monitoring across the full judge portfolio
- Own the international LLMAJ expansion: drive judge localization from en-US to global marketplaces, identify coverage gaps, define remediation plans, and validate judge quality per locale
- Facilitate model evaluation and debugging: work with Language Engineers and Scientists to trace response quality issues, inspect production logs, and root-cause judge disagreements or quality regressions
- Build purpose-built tools and agents: code automation using internal agent frameworks to streamline governance workflows, judge monitoring, data extraction, and reporting
- Define and own partner-facing quality metrics powered by LLMAJ, including defect rates, agreement rates, and evaluation dimension reporting across partner teams
- Drive human-in-the-loop validation workflows, coordinating between evaluation platforms and annotation teams to maintain judge calibration
- Drive discipline on evaluation requests by enforcing data-driven problem statements, clear scoping, and a definition of done before work begins
- Write business requirements documents, contribute to leadership updates, and represent LLMAJ governance in cross-functional forums

A day in the life
You start the morning checking agreement rate dashboards for drift across international locales and triaging alerts. A new prompt release is shipping, so you pull evaluation results, spot two judges regressing in the Japanese marketplace, and open a debugging session with a Language Engineer to trace the root cause. After lunch you present international judge coverage in a cross-functional forum. In the afternoon you ship an update to a governance agent you built that auto-generates weekly judge health reports. You close the day pushing back on an under-scoped evaluation request.

About the team
We are the team responsible for measuring whether Amazon's AI shopping assistant is actually good. We build LLM judges, define quality standards, and run evaluations that directly inform what ships to hundreds of millions of customers. Our team includes Language Engineers, Data Scientists, and Product Managers who work closely with Science, Engineering, and Product teams across the organization. We move fast, care deeply about measurement rigor, and believe that if you cannot measure quality automatically, you cannot improve it at scale.

- Bachelor's degree
- Experience in technical product management, program management, or engineering
- Experience owning/driving roadmap strategy and definition
- Experience with end-to-end product delivery
- Experience with feature delivery and tradeoffs of a product
- Experience contributing to engineering discussions around technology decisions and strategy related to a product
- Experience representing and advocating for a variety of critical customers and stakeholders during executive-level prioritization and planning

- Experience using analytical tools such as Tableau, QlikView, or QuickSight
- Experience in building and driving adoption of new tools

Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify, and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice to know more about how we collect, use, and transfer the personal data of our candidates.

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit
for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.


Required Experience:

Senior IC

About Company

Free shipping on millions of items. Get the best of Shopping and Entertainment with Prime. Enjoy low prices and great deals on the largest selection of everyday essentials and other products, including fashion, home, beauty, electronics, Alexa Devices, sporting goods, toys, automotive ...