Snapshot
Help us build generative models of the 3D world. World models power numerous domains such as media generation, visual reasoning, simulation, planning for embodied agents, and real-time interactive experiences. Work with us to build better versions of Gemini, Genie, and Veo while also exploring new spatial modalities beyond images and videos.
The Role
Key responsibilities: Conduct research to build generative multimodal models of the 3D world. Solve essential problems to train world models at massive scale: build and train large-scale systems for data annotation; curate and annotate training datasets; build and maintain large model training infrastructure; develop scaling ladders and training recipes; develop metrics for spatial intelligence; enable real-time interactive experiences; study the integration of spatial modalities with multimodal language models; and, of course, actually train massive-scale models.
Areas of focus:
- 3D computer vision and spatial annotation systems
- Spatial representations
- Training large-scale transformers
- Generative pixel and latent models
- Infrastructure for large-scale data pipelines and annotation
- Quantitative evals for spatial accuracy and intelligence
- Model scaling, efficiency, distillation, and training infrastructure
About you
We seek individuals who are passionate about large-scale generative models and believe spatial understanding and generation are on the path to intelligence. We strive for simple methods that scale, and look for candidates excited to improve models through infrastructure, data, evals, and compute.
To set you up for success as a Research Scientist/Engineer at Google DeepMind, we look for the following skills and experience:
- MSc or PhD in computer science or machine learning, or equivalent industry experience.
- Experience with large-scale transformer models and/or large-scale data pipelines.
- Track record of releases, publications, and/or open-source projects relating to video generation, world models, multimodal language models, or transformer architectures.
- Exceptional engineering skills in Python and deep learning frameworks (e.g., JAX, TensorFlow, PyTorch), with a track record of building high-quality research prototypes and systems.
- Demonstrated experience in large-scale training of multimodal generative models.
In addition the following would be an advantage:
- Experience building training codebases for large-scale video or multimodal transformers.
- Expertise optimizing efficiency of distributed training systems and/or inference systems.
- Strong background in 3D representations or 3D computer vision.
- Strong publication record at top-tier machine learning, computer vision, and graphics conferences (e.g., NeurIPS, ICLR, ICML, SIGGRAPH, CVPR, ICCV).
- A keen eye for visual aesthetics and detail, coupled with a passion for creating high-quality, visually compelling generative content.