At Toyota Research Institute (TRI), we're on a mission to improve the quality of human life. We're developing new tools and capabilities to amplify the human experience. To lead this transformative shift in mobility, we've built a world-class team advancing the state of the art in AI, robotics, driving, and material sciences.
Overview
We are seeking a creative and technically strong researcher to advance post-training methods for Vision-Language-Action (VLA) models in robotics. This role focuses on improving model alignment, robustness, and adaptability in real-world robotic settings through advanced post-training and continual learning techniques. You will develop algorithms and frameworks that enable persistent learning and optimize data efficiency in embodied systems.
Responsibilities
- Post-training and adaptation: Design and implement post-training pipelines for VLA models using techniques such as reinforcement learning (RL), reinforcement learning from human or AI feedback (RLHF/RLAIF), and in-context learning. Experience with real-world RL is a plus!
- Sim-to-real transfer: Develop methods to enhance real-world transferability of policies trained in simulation.
- Reset-free and continual learning: Explore and implement reset-free, autonomous data collection strategies that enable continual skill improvement without manual resets or supervision, including continual learning in settings with large-scale, long-term data collection.
- Structured exploration: Investigate exploration algorithms that balance safety, curiosity, and efficiency for data gathering in both simulation and real-world robotic systems.
- Data curation and feedback loops: Lead the design of data collection and curation pipelines for exploration and post-training, using multimodal data from demonstrations, teleoperation, and on-policy rollouts.
- Collaborate across teams in perception, control, and ML infrastructure to deploy scalable and reproducible research systems.
- Publish research outcomes and contribute to the open robotics and embodied AI communities.
Qualifications
- Ph.D. or M.S. in Robotics, Machine Learning, Computer Vision, or a related field, or equivalent applied research experience.
- Expertise in reinforcement learning, imitation learning, and multimodal representation learning.
- Strong proficiency with deep learning frameworks (e.g., PyTorch, JAX) and robotics simulation environments (e.g., MuJoCo, Isaac Sim, PyBullet, Habitat).
- Experience with sim-to-real transfer, policy adaptation, or continual learning in embodied settings.
- Strong coding and experimental skills, with an emphasis on reproducibility and evaluation at scale.
- Prior robotics experience with real-world hardware and ML-based robot deployments.
Bonus Qualifications
- Prior work on VLA models (e.g., PI0/PI0.5, OpenVLA, or custom models).
- Experience building or managing robot data collection infrastructure.
- Familiarity with real-world robot platforms (e.g., Franka arms, humanoids, or mobile manipulators).
- Publications in top-tier conferences (CoRL, RSS, NeurIPS, ICLR, ICML, ICRA, CVPR).
The pay range for this position at commencement of employment is expected to be between $176,000 and $264,000/year for California-based roles. Base pay offered will depend on multiple individualized factors, including but not limited to business or organizational needs, market location, and job-related knowledge, skills, and experience. TRI offers a generous benefits package, including medical, dental, and vision insurance, 401(k) eligibility, paid time off (including vacation, sick time, and parental leave), and an annual cash bonus structure. Additional details regarding these benefit plans will be provided if an employee receives an offer of employment.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment; final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Required Experience:
IC (Individual Contributor)