Research Scientist, Frontier, Zurich

DeepMind

Job Location:

Zürich - Switzerland

Monthly Salary: Not Disclosed
Posted on: 7 hours ago
Vacancies: 1 Vacancy

Job Summary

Snapshot

At Google DeepMind, we foster an environment where ambitious, long-term research flourishes. Our team is tackling one of the hardest problems in modern AI: post-training frontier models. Unlike smaller models that can rely on distillation, our frontier models require novel training signals to advance the state of the art. We are defining the horizontal recipes, from revamping RL prompts to advancing Reward Models (RM), that allow these models to think better, reason deeper, and align more closely with human intent. We believe that mastering the feedback loop between user signals and model behavior is the key to breaking through current performance plateaus.

About Us

Artificial Intelligence could be one of humanity's most useful inventions. At Google DeepMind, we're a team of scientists, engineers, machine learning experts, and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

We are seeking a Research Scientist or Engineer to lead the development of next-generation post-training recipes. In this role, you will move beyond standard tuning; you will architect the Reward Modeling and Reinforcement Learning strategies that define how our most capable models learn. You will focus specifically on hard capabilities, such as improving chain-of-thought reasoning and complex instruction following, where synthetic data and distillation fall short. You will work horizontally to ensure these recipes scale across text, audio, and multimodal domains, establishing the gold standard for how Gemini evolves.

Key responsibilities:

  • Frontier Recipe Development: Design and validate novel post-training pipelines (SFT, RLHF, RLAIF) specifically for frontier-class models where no teacher model exists.
  • Advance Reward Modeling: Lead research into next-gen Reward Models, including investigating new architectures, reducing reward hacking, and improving signal-to-noise ratios in preference data.
  • Unlock Thinking Capabilities: Develop innovative methods to improve the model's internal reasoning (chain-of-thought), focusing on correctness, logic, and self-correction in multi-step tasks.
  • Revamp RL Paradigms: Critically re-evaluate and optimize RL prompts and feedback mechanisms to extract maximum performance from the underlying base models.
  • Solve the Flywheel Challenge: Create robust mechanisms to turn user signals and interactions into training data that continuously improves the model without introducing regression or bias.
  • Horizontal Impact: Collaborate across teams to apply these advanced recipes to various model sizes and modalities (e.g., audio), ensuring consistent, high-quality behavior.

About You

To set you up for success as a Research Scientist at Google DeepMind, we look for the following skills and experience:

  • PhD in machine learning, artificial intelligence, or computer science (or equivalent practical experience).
  • Strong background in Large Language Models (LLMs), Reinforcement Learning (RL), or preference learning.
  • Research interest in aligning AI systems with human feedback and utility.
  • Familiarity with experiment design and analyzing large-scale user data.
  • Strong coding and communication skills.

Preferred requirements

  • Experience with RLHF (Reinforcement Learning from Human Feedback) or DPO (Direct Preference Optimization).
  • Experience building or improving reward models and conducting human evaluation studies.
  • A proven track record of publications in top-tier conferences (e.g., NeurIPS, ICML, ICLR).
  • Experience with Chain-of-Thought (CoT) reasoning research or process-based supervision.
  • Deep understanding of, and experience with, training models from scratch or using self-play/self-improvement techniques.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds, and perspectives, and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy or related condition (including breastfeeding), or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.


Required Experience:

IC

Key Skills

  • Laboratory Experience
  • Machine Learning
  • Python
  • AI
  • Bioinformatics
  • C/C++
  • R
  • Biochemistry
  • Research Experience
  • Natural Language Processing
  • Deep Learning
  • Molecular Biology

About Company

Artificial intelligence could be one of humanity’s most useful inventions. We research and build safe artificial intelligence systems. We're committed to solving intelligence, to advance science and benefit humanity.
