Research Scientist, Science of Post-Training and Reinforcement Learning

DeepMind

Job Location:

London - UK

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Snapshot

We are starting a small team aimed at building a real science of post-training for agents. This involves reinforcement learning for LLM-based systems, rigorous experimentation, and a focus on scaling, evaluation, and the practical details that make methods work.

This Research Scientist role is intentionally hands-on. The core loop is: form a hypothesis, implement it, run strong experiments, analyze what happened, and decide what to do next. We care about research that holds up over time, not just incremental wins.

About Us

Artificial Intelligence could be one of humanity's most useful inventions. At Google DeepMind, we're a team of scientists, engineers, machine learning experts, and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

You will work closely with Ian Osband and the team on research around post-training for agents and LLMs, including practical RL methods and evaluation. This is not a theory-only role; you should expect to implement code, run experiments, and own results end-to-end. Success in this role is defined by whether the team learns faster and whether the work produced is crisp, honest, and high-quality.

Key Responsibilities

Propose and test research hypotheses in post-training and RL for agents/LLMs.
Implement algorithm ideas and run end-to-end experiments, including setup, execution, analysis, and iteration.
Design evaluations and ablations that answer real questions and change minds.
Analyze results carefully, including debugging and failure analysis.
Communicate clearly through plots, writeups, and paper-ready narratives and figures.
Collaborate closely with engineering and research partners to keep the team aligned on findings and strategy.
Contribute to a culture of first-principles thinking, high standards, and direct, constructive feedback.

About You

To set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:

A research track record in ML/RL, demonstrated through publications or high-quality projects.
Strong implementation ability and comfort working in research codebases.
Evidence of owning experiments end-to-end, including analysis and interpretation.
Strong communication skills and a bias toward clarity and honesty regarding results.
High agency and drive: You push projects forward, prioritize effectively, and take initiative.
PhD in ML preferred, or equivalent practical experience.

In addition, the following would be an advantage:

Experience with RL for sequence models, post-training, preference-based learning, or agentic systems.
Experience with modern research stacks (e.g. JAX/Flax or PyTorch) and scaling experiments.
Strong experimental taste: Good judgment regarding baselines, ablations, and what is worth testing.
Comfort with scaling evaluation methodologies and diagnosing complex failure modes.
A focus on craft: You care about doing excellent work while maintaining a high velocity.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds, and perspectives, and harness these qualities to create extraordinary impact. We are committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy or related condition (including breastfeeding), or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.

Note: In the event your application is successful and an offer of employment is made to you, any offer of employment will be conditional on the results of a background check performed by a third party acting on our behalf. For more information on how we handle your data, please see our Applicant and Candidate Privacy Policy.

Closing date: Tuesday 17th March at 5:00pm GMT

About Company

DeepMind

Artificial intelligence could be one of humanity’s most useful inventions. We research and build safe artificial intelligence systems. We're committed to solving intelligence, to advance science and benefit humanity.
