At Toyota Research Institute (TRI), we're on a mission to improve the quality of human life. We're developing new tools and capabilities to amplify the human experience. To lead this transformative shift in mobility, we've built a world-class team in Automated Driving, Energy & Materials, Human-Centered AI, Human Interactive Driving, Large Behavior Models, and Robotics.
The Team
The Learning From Videos (LFV) team in the Robotics division focuses on the development of foundation models capable of leveraging large-scale multi-modal (RGB, depth, flow, semantics, bounding boxes, tactile, audio, etc.) data from multiple domains (driving, robotics, indoors, outdoors, etc.) to improve the performance of downstream tasks. This paradigm targets training scalability, since data from multiple modalities can be equally leveraged to learn useful data-driven priors (3D geometry, physics, dynamics, etc.) for world understanding. Our topics of interest include, but are not limited to, Video Generation, World Models, 4D Reconstruction, Multi-Modal Models, Multi-View Geometry, Data Augmentation, and Video-Language-Action models, with a primary focus on embodied applications. We are aiming to make progress on some of the hardest scientific challenges around spatio-temporal reasoning and how it can lead to the deployment of autonomous agents in real-world, unstructured environments.
The Postdoc
This year-long postdoctoral research position will be highly integrated into our team, with hands in both ongoing and new research and development threads in the areas of:
- 4D World Models
- Physical and Embodied Intelligence
- Multi-Modal Learning
This researcher will have the opportunity to work collaboratively with our team at TRI on high-risk, high-reward projects, pushing forward our understanding of spatio-temporal reasoning and zero-shot generalization. This is a research-focused position targeting the development of methods and techniques that can solve real-world problems. We welcome you to join a positive, friendly, and enthusiastic team of researchers, where you will contribute to helping people gain and maintain independence, access, and mobility. We work closely with other Toyota affiliates and actively collaborate towards research publications and the productization of our developed technologies.
Responsibilities
- Develop, integrate, and deploy algorithms for Multi-Modal and 4D reasoning, targeting physical applications.
- Handle the ingestion of large-scale datasets for training, including streaming, online, and continual learning.
- Invent and deploy innovative solutions at the intersection of machine learning, computer vision, and robotics that improve the real-world performance of useful tasks.
- Work closely with robotics and machine learning researchers and engineers to understand theoretical and practical needs.
- Follow best practices, producing maintainable code both for internal use and for open-sourcing to the scientific community.
Qualifications
- Ph.D. in a relevant technical field.
- A strong background in computer vision and its applications to robotics and embodied systems.
- A standout colleague with strong communication skills and an ability to learn from others and contribute back to the scientific community with publications or open-source code.
- Passionate about assisting and amplifying older adults and those in need through dexterous manipulation, human-robot collaboration, and physical assistance innovation.
Bonus Qualifications
- Spatio-temporal (4D) computer vision, including multi-view geometry, 3D/4D reconstruction, video generation, self-supervised learning, occlusion reasoning, etc.
- Large-scale training of multi-modal deep learning methods, both in terms of dataset sizes and model complexity, context length extension and efficient attention, distributed computing, etc.
- Application of machine learning and computer vision to embodied applications.
The pay range for this position at commencement of employment is expected to be between $176,000 and $264,000/year for California-based roles; however, base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. Note that TRI offers a generous benefits package (including 401(k) eligibility and various paid time off benefits, such as vacation, sick time, and parental leave) and an annual cash bonus structure. Details of participation in these benefit plans will be provided if an employee receives an offer of employment.
Please reference this Candidate Privacy Notice to inform you of the categories of personal information that we collect from individuals who inquire about and/or apply to work for Toyota Research Institute, Inc. or its subsidiaries, including Toyota A.I. Ventures GP, L.P., and the purposes for which we use such personal information.
TRI is fueled by a diverse and inclusive community of people with unique backgrounds, education, and life experiences. We are dedicated to fostering an innovative and collaborative environment by living the values that are an essential part of our culture. We believe diversity makes us stronger and are proud to provide Equal Employment Opportunity for all, without regard to an applicant's race, color, creed, gender, gender identity or expression, sexual orientation, national origin, age, physical or mental disability, medical condition, religion, marital status, genetic information, veteran status, or any other status protected under federal, state, or local laws.
It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability. Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records for employment.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.