Research Scientist Post-training

Techire AI

Job Location: San Francisco, CA - USA

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

Job Description

Training builds capability. Post-training decides what it becomes.

This team are rethinking how large multimodal models learn after pre-training, developing post-training and reinforcement learning methods that help models reason, plan, and interact in real time.

Founded by the researchers behind several of the most influential modern AI architectures, this lab are pushing alignment and learning efficiency beyond standard RLHF. They're scaling preference-based training (RLHF, DPO, hybrid feedback loops) to new model types and creating systems that learn from interaction rather than static data.

You'll work at the intersection of post-training, RL, and model architecture, designing reward models, scalable evaluation frameworks, and training strategies that make large-scale learning measurable and reliable. It's applied research with direct impact, supported by serious compute and a tight researcher-to-GPU ratio.

You'll bring experience in large-scale post-training or reinforcement learning (RLHF, DPO, or SFT pipelines), a solid grasp of LLM or multimodal training systems, and the curiosity to explore new optimisation and alignment methods. A publication record at top venues (NeurIPS, ICLR, ICML, CVPR, ACL) is a plus, but impact matters more than titles.

The team are based in San Francisco, working mostly in person. $1 million total compensation. Base salary circa $300K to $600K (negotiable) plus stock and bonus; exact package depends on experience.

If you want to work where post-training meets architecture, shaping how foundation models learn, reason, and adapt, this is that opportunity.

All applicants will receive a response.


Required Experience:

IC

Key Skills

  • Laboratory Experience
  • Machine Learning
  • Python
  • AI
  • Bioinformatics
  • C/C++
  • R
  • Biochemistry
  • Research Experience
  • Natural Language Processing
  • Deep Learning
  • Molecular Biology