Teach AI how to reason safely, transparently, and at scale.
How do we move beyond pattern-matching into true machine reasoning? This Applied Scientist role puts you at the centre of that challenge, developing models that can reason, explain their logic, and make verifiable decisions across complex, high-stakes industries.
You'll join a well-funded startup building domain-specific reasoning systems and agentic AI for sectors like medtech, aerospace, and advanced manufacturing, where reliability and interpretability aren't optional.
Your work will focus on post-training large multimodal models, applying the latest techniques in RLHF, DPO, and preference learning to make AI systems more consistent, factual, and aligned with human reasoning. You'll design the frameworks that turn raw model potential into transparent, trustworthy intelligence.
You'll develop and optimise post-training pipelines, implement reward modelling for reasoning depth and factual accuracy, and build evaluation frameworks for verifiable, human-aligned behaviour. Working with proprietary and synthetic datasets, you'll run end-to-end experiments and deploy your methods directly into production.
You'll bring a background in transformer-based model training (LLM, VLM, MLLM), post-training or alignment (RLHF, DPO, reward modelling), and strong practical skills in Python and PyTorch. Curiosity about reasoning agents, hybrid learning, and interpretability research will help you thrive here.
Bonus points for experience in multimodal reasoning, evaluation and verification, or prior research contributions in alignment or reasoning systems.
The company has raised $20M (Series A announcement imminent) and already partners with Fortune 100 and 500 customers. It was founded by an entrepreneur with a prior billion-dollar exit, and the AI team alone is scaling from 11 to 40 this year.
Comp: $200K-$320K base (negotiable depending on experience), bonus, stock, benefits
Location: SF Bay Area (remote for now; hybrid later in 2026)
If you're excited about defining how AI systems reason, decide, and explain themselves, we'd love to hear from you.
All applicants receive a response.
Required Experience:
IC