Research Engineer, Post-Training

Harvey

Job Location:

San Francisco, CA - USA

Salary: $231,000 - $340,000

Posted on: 2 days ago

Vacancies: 1 Vacancy

Department:

Engineering

Job Summary

Why Harvey

At Harvey, we're transforming how legal and professional services operate. By combining frontier agentic AI, an enterprise-grade platform, and deep domain expertise, we're reshaping how critical knowledge work gets done for decades to come.

This is a rare chance to help build a generational company at a true inflection point. With 1,500 customers in 60 countries, strong product-market fit, and world-class investor support, we're scaling fast and defining a new category in real time. The work is ambitious, the bar is high, and the opportunity for growth (personal, professional, and financial) is unmatched.

Our team moves fast, takes ownership, and is deeply committed to the mission, operating with intensity, staying close to our customers, and pushing each other for excellence. We live by three values: Decisiveness, Simplicity, and Jobs Not Finished. We act quickly on clear judgment over perfect information, we believe simplicity is what scales, and we're never satisfied with where we are. If you want to do the best work of your career alongside people who share that drive, we'd love to build with you.

At Harvey, the future of professional services is being written today, and we're just getting started.

Role Overview

Post-training is how Harvey turns expert feedback and agent traces into models that are meaningfully better at legal work. We are looking for a research engineer who can help scale that loop: defining and running model training experiments, interpreting results, and working with internal and external research partners to build better data, environments, graders, and training recipes.

This role is for someone who can self-manage model training and applied research projects. You will work closely with internal and external research collaborators on post-training efforts that matter to our product roadmap. The ideal candidate has extensive hands-on experience training open-weight models, either in a research or production setting, and enough engineering depth to run and debug experiments efficiently.

What You'll Do

Drive post-training experiments, pushing agent performance while navigating the Pareto frontier of cost, latency, security, and governance.
Optimize agent harnesses, including domain-specific skills, tools, subagents, retrieval strategies, and validation loops that improve quality on long-horizon legal work.
Design and develop grading and reward systems that are reliable enough for evaluation, efficient enough for iteration, and strict enough for high-stakes legal work.
Study agent behavior, identifying patterns that correlate with successful work product and converting those findings into training data, evals, or harness changes.
Work with Harvey researchers and external research partners to define experiments, evaluate methodology, review results, and keep projects moving toward concrete model improvements.

What You Have

Hands-on experience with post-training or model-training work, such as SFT, preference optimization, RLHF/RLAIF, reward modeling, distillation, or adapting open-weight models to specialized domains.
Strong judgment about model behavior: you can read traces, inspect outputs, identify failure modes, and reason about whether a metric is measuring the thing that matters.
Strong Python and research-engineering ability. You can write clean code, debug experiments, and build the simple but reliable systems needed to make research move faster.
Ability to self-manage ambiguous applied research projects and communicate clearly with researchers, engineers, product teams, domain experts, and external partners.

Nice to Have

Experience building data or evaluation infrastructure for ML workflows, such as dataset curation pipelines, model-output processing, experiment tracking, evaluation dashboards, or regression analysis tooling.
Experience with distributed training, inference systems, GPU workloads, or large-scale ML experimentation.
Research publications, open-source contributions, or shipped industry work in LLMs, agents, evaluation, or ML systems.

Compensation

$231,000 - $340,000

Depending on your location, an Applicant Privacy Notice may apply to you. You can find all of our Applicant Privacy Notices here.

Harvey is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.

We are committed to providing reasonable accommodations to applicants with disabilities and requests can be made by emailing

About Company

Harvey

Professional Class AI – Harvey is the platform built to meet the standards of the world’s leading professional service firms.
