$200k - $250k
1 Vacancy
Ready to build AI agents that actually solve real-world problems?
Join a well-funded AI startup developing the next generation of agentic AI systems - general-purpose models capable of solving real-world user problems end-to-end with minimal prompting.
They're building an agentic AI desktop assistant that integrates deeply with operating systems, positioning it as a potential competitor to OpenAI's Operator. With over $40 million in funding, they're pushing the boundaries of what's possible in LLMs, reinforcement learning, and autonomous AI systems.
This role offers the rare opportunity to work on foundational AI research that directly translates into practical applications. You'll be building technology that transforms how people interact with computers, creating AI agents that don't just talk but actually perform complex tasks autonomously.
Your focus:
You'll join the foundational AI research team, tackling some of the most challenging problems in AI today. Your work will span from theoretical research to practical implementation, with the goal of building core models that surpass existing capabilities while maintaining the highest accuracy and the longest possible context windows.
As a Research Engineer, you'll design and implement cutting-edge LLM architectures specifically for agentic behaviour and real-world task solving. You'll own components of the RL training pipeline, including reward model development, evaluation, and deployment. Working with high-quality datasets for supervised fine-tuning and RL will be central to your role, as will experimenting with novel training algorithms for agentic systems.
You'll contribute to benchmark design and evaluation strategies for both reward models and policy models, staying at the frontier of LLM, RL, and agent research whilst turning theory into systems that can be deployed at scale.
You should have:
A deep understanding of LLM post-training, fine-tuning, and evaluation, especially in the context of agents. Experience designing or training reward models for alignment or task optimisation is essential, along with hands-on experience implementing RLHF, DPO, or similar methods to align language models with human feedback.
You'll need strong intuition for dataset quality - knowing how to spot problematic data and craft effective supervision. Fluency with PyTorch, Hugging Face, and modern LLM libraries is crucial, as is the ability to prototype and scale new ideas quickly.
The ideal candidate can implement novel training algorithms, architectures, or evaluation strategies from research papers or original ideas, bringing both theoretical understanding and practical engineering skills to complex AI challenges.
Nice to have:
Experience scaling LLMs beyond 10B parameters would be valuable, as would prior research experience in LLMs, RL, or reward learning. Publications in relevant venues are appreciated, but practical implementation experience is equally valued.
What they offer:
This is an opportunity to work with a top-tier team of researchers and engineers building aligned, general AI systems that are truly helpful and deployable. Your work will directly shape how AI interacts with the real world, moving beyond conversational interfaces to systems that perform meaningful tasks.
The compensation reflects the senior nature of the role: $200k-$250k base salary (negotiable based on experience) plus significant equity in a well-funded company positioned at the forefront of agentic AI. They provide comprehensive benefits and visa sponsorship for exceptional candidates.
The role is office-based near Palo Alto. They're looking for energetic, fast-paced individuals who bring fresh excitement to the space and want to push the boundaries of what's possible with autonomous AI systems.
If you're excited about building AI agents that move beyond conversation to actual task completion, this could be your opportunity to make a lasting impact on the future of human-computer interaction.
Ready to help define the next generation of AI? All applicants will receive a response.
Full-Time