Help us push the boundaries of what's possible in LLM post-training. If you love training models, exploring new architectures, running experiments, and turning research insights into products that ship, we'd love to meet you.
About
We train and host specialized language models for companies that want frontier-quality AI at a fraction of the cost. The models we train match GPT-5 accuracy but are smaller, faster, and up to 90% cheaper. Our platform handles everything end-to-end: distillation, training, evaluation, and planet-scale hosting.
We are a well-funded, ten-person team of engineers working in person in downtown San Francisco on difficult, high-impact engineering problems. Everyone on the team has been writing code for over 10 years and has founded and run their own software companies. We are high-agency, adaptable, and collaborative, and we value creativity alongside technical prowess and humility. We work hard and deeply enjoy what we do. Most of us are in the office 4 days a week in SF; hybrid works for Bay Area candidates.
About the Role
You will conduct research into experimental models, training systems, and modalities to create novel products for our customers. Your work will span from exploring new architectures and learning methods to optimizing latency and efficiency, all with the goal of delivering better models.
Your north star is pushing the frontier of what's possible in LLM post-training. You'll explore new techniques, run rigorous experiments, and, when something works, help bring it into production with your teammates. This includes training models for customers and running evaluations to validate your research. The role reports directly to the founding team, and you'll have the autonomy, a large compute budget / GPU reservation, and the technical support to explore ambitious ideas and ship the ones that work.
Key Responsibilities
Research and experiment with new model architectures to improve quality, efficiency, or capability
Explore methods to decrease inference latency and improve serving efficiency
Run experiments with new learning methods, including novel approaches to SFT, RLHF, DPO, and other post-training techniques
Perform reinforcement learning research to improve model alignment and capability
Develop and improve our distillation pipeline for training high-quality models from frontier teachers
Train models for clients and run evaluations to validate research findings in production settings
Create robust benchmarks and evaluation frameworks that ensure custom models match or exceed frontier performance
Stay current with ML research and identify techniques that can improve our platform
Collaborate with applied engineers to bring successful research into production systems
Document findings and share knowledge with the team
Requirements
3 years of experience training AI models using PyTorch
Deep understanding of transformer architectures, attention mechanisms, and model internals
Hands-on experience post-training LLMs using SFT, RLHF, DPO, or other alignment techniques
Experience with LLM-specific training frameworks (e.g., Hugging Face Transformers, DeepSpeed, Megatron, TRL, or similar)
Strong experimental methodology, including the ability to design, run, and analyze rigorous experiments
Track record of implementing ideas from recent ML papers
Experience training on NVIDIA GPUs at scale
Strong foundation in ML fundamentals: optimization, loss functions, regularization, and generalization
Nice-to-Have
Publications in ML venues
Experience with model distillation or knowledge transfer
Experience with LLM speed optimization techniques
Familiarity with vision encoders multimodal models or other modalities
Experience with distributed training and infrastructure at scale
Contributions to open-source ML projects
You don't need to tick every box. Curiosity and the ability to learn quickly matter more.
Compensation
We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $250,000 - $350,000, depending on experience, plus equity and benefits.
Equal Opportunity
We are an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.
If you're excited about pushing the boundaries of custom AI research, we'd love to hear from you. Please send your resume and GitHub, or apply here on Ashby.
Required Experience: IC