Research Lead
Berkeley, CA - USA
Job Summary
About Us
Since our founding in July 2022, we've grown to 40 staff, published 40 academic papers, and convened leading AI safety events. Our work is recognized globally, with publications at premier venues such as NeurIPS, ICML, and ICLR, and features in the Financial Times, Nature News, and MIT Technology Review. We conduct pre-deployment testing on behalf of frontier developers such as OpenAI, and independent evaluations for governments including the EU AI Office. We help steer and grow the AI safety field by developing research roadmaps with renowned researchers such as Yoshua Bengio; running an AI safety-focused co-working space in Berkeley housing 40 members; and supporting the community through targeted grants to technical researchers.
About
We explore promising research directions in AI safety and scale up only those showing high potential for impact. Once the core research problems are solved, we work to scale them into a minimum viable prototype, demonstrating their validity to AI companies and governments to drive adoption.
Our current research includes:
Adversarial Robustness: working to rigorously solve security problems by building a science of security and robustness for AI, from demonstrating that superhuman systems can be vulnerable, to scaling laws for robustness, to jailbreaking constitutional classifiers.
Mechanistic Interpretability: finding issues with Sparse Autoencoders, probing deception using AmongUs, understanding learned planning in Sokoban, and interpretable data attribution.
Red-teaming: conducting pre- and post-release adversarial evaluations of frontier models (e.g. Claude 4 Opus, ChatGPT Agent, GPT-5), and developing novel attacks to support this work.
Evals: developing evaluations for new threat models, e.g. persuasion and tampering risks.
Mitigating AI deception: studying when lie detectors induce honesty or evasion, and developing mitigations for deception and sandbagging.
We are particularly looking to add Research Leads in the following pod shapes:
Applied Interpretability: using interpretability to tackle concrete safety problems (better probes, backdoor detection, deception monitoring), aiming for fast feedback loops, often in collaboration with our other pods. A new, greenfield pod.
Scalable Oversight / Alignment: methods that keep oversight robust as models become more capable than their supervisors, such as recursive reward modeling, debate, weak-to-strong generalization, and process-based supervision. A greenfield area we'd like to stand up.
Adversarial Robustness / Guardrails: extending our independent-testing work into deployed-system protection: better constitutional classifiers, pre-training safety interventions (initially CBRN misuse, especially for open-weight models), backdoor detection and mitigation, realistic cybersecurity evaluations, and loss-of-control deception evaluations.
Auditing / Evals: auditing for alignment, not just capabilities: evaluation awareness (construct validity, safety-relevance, hyper-realistic evals), CoT monitorability and faithfulness training, and black-box monitoring as a complement to our existing white-box work.
Persuasion / Epistemic Risks: the science of epistemic risks and intervention points, persuasion's role in loss-of-control risks, evaluations and independent testing, and connections to broader harmful-manipulation solutions and epistemic uplift. Part building on our existing work, part shaping your own agenda in the area.
Bring Your Own Agenda: an open track for senior researchers with a strong vision outside the pods above.
About the Role
Research Leads define and own a research workstream end-to-end. Day-to-day that means:
Articulate a research agenda with a clear theory of change for mitigating catastrophic risks from human-level or superhuman AI systems, and/or vastly increasing the upside of such systems.
Grow and lead a team of technical staff in pursuit of this agenda, either directly or in partnership with an engineering co-lead.
Lead novel research projects where there may be unclear markers of progress or success.
Share your research findings through written content (e.g. academic publications, blog posts) and presentations (e.g. ML conferences, policymaker briefings) to drive adoption and change.
Mentor and coach junior team members in research skills and ML engineering.
Contribute to the intellectual environment for example by giving feedback on early-stage proposals.
This role would be a great fit if you:
Want to work on the most impactful research directions alongside mission-driven colleagues who'll push them forward with you.
Wish to pursue empirically grounded, scalable research directions that lean, technically strong teams can drive forward.
Value the ability to speak freely. We don't censor our researchers; we just ask that you protect confidential information and make clear when you're speaking personally or on behalf of the organization.
Want to advise and collaborate with governments, leading AI companies, and academics. We're a small organization that punches above its weight by working closely with these partners through red-teaming, technical standards work, and research collaborations.
This role would be a poor fit if you:
Prefer solo IC research to leading a team toward a shared agenda. Some people can do great research that way, but in this role we're looking for someone whose research direction is strong enough that other excellent researchers want to build it with them.
Prioritize novelty and intellectual elegance over impact. We care about both: a mathematically elegant solution to AI safety would be wonderful, but when we have to choose, we choose what makes AI safer in practice.
Can only work with the largest compute clusters available at industry labs, or need to be compensated with equity in a rapidly growing startup. We offer competitive salaries and sizable compute budgets on a cluster that we manage, but if you value these things over having a positive impact on the future, then you may be better suited to a for-profit lab.
About You
To be a strong candidate for the Research Lead role you likely:
Have a strong existing research track record in AI or another highly technical subject (e.g. CS, math, physics).
Have a clear view of which safety research directions are likely to matter most over the next few years and why.
Have either (a) a clear research agenda you'd pursue here, with a theory of change explaining why it's valuable, or (b) a strong track record and a research space you'd sharpen into an agenda over your first months. We assess both paths against the same bar; depth of articulation at application is itself a signal about expected runway.
Have led a team, mentored graduate students, or supported early-career researchers through fellowship programs. Informal leadership in flatter organizations counts; we look at substance, not titles.
Can effectively communicate novel methods and solutions to both technical and non-technical audiences.
Hold a PhD or have 2 years of research experience in computer science, artificial intelligence, machine learning, or statistics.
It is preferable if you:
Have an established publication record in AI safety.
Are comfortable writing grant proposals and navigating collaborations with other organizations or external research groups.
If you are missing key leadership experience or are earlier in your career, we encourage you to consider the open Research Scientist pathway and invite you to contribute to one of our existing agendas.
We're also open to more senior versions of this role; simply apply or reach out to .
Logistics
If based in the USA or Singapore, you will be an employee of (501(c)(3) research non-profit / non-profit CLG). Outside the USA or Singapore, you will be employed via an EOR organisation on behalf of or as a contractor.
Location: Both remote and in-person (Berkeley, CA or Singapore) are possible. We sponsor visas for in-person employees and can hire remotely in most countries.
Hours: Full-time (40 hours/week).
Compensation: $170,000–$250,000/year, depending on experience and location, with the potential for additional compensation for exceptional candidates. We will also pay for work-related travel and equipment expenses. We offer catered lunch and dinner at our offices in Berkeley.
Application materials: Expect 1–2 hours of preparation; most carries forward from prior job searches. We ask for a CV, a short research direction statement (the form supports both fully formed agendas and developing ones), 2–3 selected works with a brief note on your personal contribution, and a short note on why is a good home for your direction. If you advance to portfolio review, we'll ask for a full research direction statement (1–2 pages with a theory of change to real-world implementation; 1.5–2 hours, due within about a week).
Process: From application: a portfolio review (async), a 60-minute bilateral fit call, a research deep-day (3.5 hours live, including an open talk to FAR research staff and two interview sessions), a 5-day paid work trial, structured reference calls, and a final decision panel. Typical elapsed time: 4–6 weeks. Total candidate time end-to-end is about 50 hours, with the paid work trial being the bulk. If a 5-day block isn't feasible for you, reach out; we can discuss alternatives.
If you have any questions about the role please do get in touch at .