About Anthropic
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the Role
Safeguards Engineering builds and operates the infrastructure that keeps Anthropic's AI systems safe in production: the classifiers, detection pipelines, evaluation platforms, and monitoring systems that sit between our models and the real world. That infrastructure needs to be not just correct but reliable: when a safety-critical pipeline goes down or degrades, the consequences can be serious, and they can be invisible until someone looks closely.
As a Technical Program Manager for Safeguards Infrastructure and Evals, you'll own the operational health and forward momentum of this stack. Your primary responsibility is driving reliability: owning the incident-response and post-mortem process, ensuring SLOs are defined and met in partnership with various teams, and making sure that when things go wrong, the right people know, the right actions get taken, and those actions actually get closed out. Alongside that ongoing operational rhythm, you'll coordinate the larger platform investments: migrations, eval-platform improvements, and the cross-team dependencies that connect them.
This role sits at the intersection of operations and program management. It requires genuine technical depth: you need to understand how these systems work well enough to triage effectively, judge what's actually safety-critical versus what can wait, and have informed conversations with the engineers building and maintaining them. But the core of the job is keeping the machine running well and the work moving.
What You'll Do:
- Own the Safeguards Engineering ops review - Drive the recurring cadence that keeps the team informed and coordinated: surfacing recent incidents and failures, bringing visibility to reliability trends, and making sure the right people are in the room when decisions need to be made. This is the heartbeat of how Safeguards Eng stays ahead of operational risk.
- Drive incident tracking and post-mortem execution - When incidents happen (and in this space, they happen regularly), you'll make sure they get followed through properly. That means tracking incidents across the organization (including those owned by partner teams like Inference), ensuring post-mortems get written, and, most critically, making sure the action items that come out of them actually get done. Closing the loop on post-mortem actions is one of the highest-leverage things this role does.
- Establish and maintain SLOs with partner teams - Work with Safeguards Engineering teams and key partners, particularly Inference and Cloud Inference, to define service-level objectives for safety-critical pipelines. Then build the tracking and reporting that makes it possible to tell whether those SLOs are being met, and surface it when they're not.
- Maintain runbook quality and incident-ownership clarity - Safety-critical systems need clear playbooks for when things go wrong. Partner with engineering leads to keep runbooks accurate, actionable, and up to date, and ensure that ownership of incidents (including for areas like account-banning false positives and CSAM detection) is unambiguous, so that nothing falls through the cracks during an active incident.
- Drive platform migrations and infrastructure projects - Own the program management for the larger infrastructure work on the roadmap: migrating infrastructure from one platform to the next, moving from one incident-management platform to another and from one cloud monitoring system to another, and other migrations as they come. These are cross-team efforts with real dependencies; your job is to keep them sequenced, on track, and connected to the teams that need them.
- Coordinate evals platform improvements - Partner with the evals engineering team to drive improvements to the evaluation platform, including self-serve capabilities and the broader eval factory infrastructure. Help scope the work, track dependencies on other Safeguards systems, and make sure the evals platform is keeping pace with the team's needs.
You might be a good fit if you:
- Have solid technical program management experience, particularly in operational or infrastructure-heavy environments; you're comfortable owning a mix of ongoing operational cadences and discrete project work simultaneously.
- Understand how production ML systems work well enough to triage incidents intelligently and have substantive conversations with engineers about what's going wrong and why; you don't need to write the code, but you need to follow the technical thread.
- Are energized by closing loops. Post-mortem action items that never get done, SLOs that no one checks, runbooks that go stale: these things bother you, and you know how to build the processes and follow-ups that fix them.
- Can work effectively across team boundaries: comfortable coordinating with partner teams (like Inference) where you don't have direct authority, and skilled at keeping shared work moving through influence and clear communication.
- Thrive in environments where the work shifts between "keep the lights on" and "build something new," and can context-switch between incident follow-ups and longer-horizon platform projects without dropping either.
- Have experience with or strong interest in AI safety; you understand why the reliability of a safety-critical pipeline is a different kind of problem than the reliability of a product feature, and that distinction motivates you.
Strong candidates may also:
- Have experience with SRE practices, incident management frameworks, or on-call operations at scale.
- Have worked on or with evaluation infrastructure for ML systems, understanding how evals get designed, run, and interpreted.
- Have experience driving infrastructure migrations in complex, multi-team environments, particularly where the migration touches operational systems that can't go offline.
- Be familiar with monitoring and alerting tooling (PagerDuty, Datadog, or equivalents) and the operational culture around them.
Deadline to apply: None; applications will be received on a rolling basis.
The annual compensation range for this role is listed below.
For sales roles, the range provided is the role's On Target Earnings (OTE) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.
Annual Salary:
$290,000 - $365,000 USD
Logistics
Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience.
Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.
We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.
Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters will only contact you through official Anthropic channels; in some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you're ever unsure about a communication, don't click any links; visit our careers page for confirmed position openings.
How we're different
We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact, advancing our long-term goals of steerable, trustworthy AI, rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.
The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.
Come work with us!
Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues. Guidance on Candidates' AI Usage: Learn about our policy for using AI in our application process.
Required Experience: Manager