Senior/Staff Site Reliability Engineer

Mochi Health

Job Location:

San Francisco, CA - USA

Salary: $230,000 - $280,000
Posted on: 19 hours ago
Vacancies: 1 Vacancy

Job Summary

Mochi Health's mission is to be the discovery layer of healthcare. We are building a platform that makes it easier for patients to find the right providers, access the right medications, and take control of their health with transparency and trust.

Over the past few years, we have experienced rapid growth by combining operational excellence, clinical expertise, and innovative technology to deliver care that is more human, intuitive, and effective. From pharmacy pricing transparency and personalized medication management to long-term medical record access and community-based chronic illness support, Mochi is creating a new model of care that empowers patients, providers, and pharmacies alike.

We believe the future of healthcare is personal, and we are building the technology to power it. At Mochi Health, you will join a team that values inclusivity, collaboration, and bold thinking, and you will have the opportunity to do the most meaningful work of your career.

$230,000 - $280,000
Full-time / Onsite (5 days/week)

About The Role

We're looking for a Senior/Staff Site Reliability Engineer to build Mochi's AI-driven APM and incident management system: one that not only alerts and pages but also learns. This is a foundational role at the intersection of SRE, platform engineering, and applied AI: you'll design the feedback loops (human-in-the-loop / RLHF-style), guardrails, and automation that let our reliability posture improve over time.

You'll own the systems and workflows that turn incidents into intelligence: automated triage, root cause analysis, remediation, and bug-fix proposals (PRs, test runs, staged rollouts) when issues are code-level.

If you're excited by the idea of building a self-improving SRE copilot, this job is for you.

What You'll Do

  • Build an AI-driven SRE platform that ingests telemetry (logs/metrics/traces), deploy events, and incident artifacts to detect anomalies, summarize failures, and propose mitigations.

  • Design a human-in-the-loop learning loop (RLHF-style) so the system gets better with every incident: capturing decisions, outcomes, and postmortems into training/evaluation data.

  • Create safe auto-remediation capabilities: runbook execution, automated rollbacks, and feature-flag actions with strong guardrails, auditability, and progressive rollout controls.

  • Build tooling that can propose bug fixes: generate well-scoped PRs, run tests, and support canary releases, with clear handoff and approval flows.

  • Define and operationalize SLOs/SLIs and error budgets for critical user journeys (patient onboarding, provider workflows, pharmacy fulfillment, billing, etc.).

  • Level up observability end-to-end: alert quality, dashboarding, tracing standards, and "unknown unknown" detection.

  • Lead incident response excellence: on-call improvements, incident command, blameless postmortems, and driving systemic fixes that reduce repeat failures.

  • Partner with product engineering teams to reduce toil and improve reliability via better architecture, load testing, resilience testing, and capacity planning.

  • Establish reliability standards and patterns across the org (golden signals, deployment safety, dependency management, fault isolation).

Who You Are

  • 7 years in SRE / platform / infrastructure engineering with a track record of owning production reliability at scale.

  • Deep experience operating Kubernetes-based systems in the cloud (AWS preferred), including networking, autoscaling, rollout strategies, and incident mitigation.

  • Strong software engineering ability: you can debug production issues across services, understand failure modes, and contribute code when needed (Python/Go/TypeScript are all great).

  • Expert-level grasp of observability and incident response: metrics, logs, tracing, alerting design, and postmortem-driven improvements.

  • Comfortable building automation that touches production, and obsessive about safety: least-privilege access, audit logs, approvals, canaries, and rollback.

  • Excited by AI tooling and agentic workflows (or already experienced): LLM-based triage/summarization, retrieval over runbooks/postmortems, evaluation harnesses, and feedback loops.

  • Strong communication and collaboration skills: you can lead during incidents, write clearly, and align teams around reliability priorities.

  • Startup mindset: you move fast, take end-to-end ownership, and love turning ambiguity into shipped systems.

  • Excited to work in-person with our team in San Francisco.

Nice to Haves

  • Experience building LLM-powered internal tools (incident copilots, automated debugging, RAG over docs/runbooks) and/or RLHF-style feedback pipelines.

  • Familiarity with security and compliance in regulated environments (HIPAA, SOC 2, audit requirements, PHI handling).

  • Experience with chaos engineering / game days and resilience testing programs.

  • Experience building CI/CD guardrails and progressive delivery systems (canaries, automated verification, safe rollout policies).

  • Prior work on distributed tracing standards (OpenTelemetry), service meshes, or large-scale event-driven systems.

Our Core Technologies Include: AWS, Kubernetes, Postgres, Redis, TypeScript/Python, SQL (plus whatever we need to build a world-class reliability platform)

Life at Mochi

At Mochi, we believe your best work happens when you feel your best, so we've designed an environment that fuels your creativity, supports your growth, and makes every day exciting.

Daily Meals and Espresso Bar - Breakfast, lunch, and dinner every weekday. Our on-site barista keeps the espresso and matcha flowing all day

Pre-Tax Commuter Perks - Save on transit and parking through pre-tax commuter benefits

Top-of-Market Compensation - We offer competitive salaries along with generous equity packages so you can share in the success you help create

Profitable and Rapid Growth - We're scaling fast with financial discipline and long-term vision. No VC constraints, just sustainable momentum and smart decisions

High-Impact Work - Help shape the future of digital healthcare. Your work here directly improves lives and scales nationwide

World-Class Team - Collaborate with teammates from Tesla, SpaceX, Citadel, Harvard, IIT, and more. We value excellence, humility, and empathy in equal measure

Comprehensive Benefits - 401(k) with match, generous time off, life insurance, and high-quality medical, dental, and vision plans

Mochi Health Membership - We cover your monthly subscription fee so you can experience the same care as our patients (medications not included)

Time to Recharge - Enjoy unlimited PTO, generous company holidays, and true flexibility. We trust you to take the time you need to rest, reset, and thrive

Wellness First - From weekly mindfulness sessions to group workouts and fitness perks, your physical and mental health are top priority

Team Socials and Community - We make time to connect through regular socials, happy hours, and spontaneous events. Our stocked kitchen doesn't hurt either

Downtown SF HQ - Our San Francisco office is just steps from BART, Muni, and great food. It's designed for deep work and casual collaboration

--

The base salary for this full-time position ranges from $230,000 to $280,000, in addition to equity and benefits. The salary range listed in each job posting represents the minimum and maximum targets for new hire salaries across all locations. Actual compensation within this range is determined by various factors such as job-related skills, experience, relevant education or training, and location.

Workplace Policy

Mochi Health is an in-person company based in San Francisco, CA. Our team works together in person five days a week to foster collaboration, innovation, and strong connections. We believe that face-to-face interaction builds a culture of excellence and allows us to deliver the best outcomes for the patients and providers we serve.

For office-based roles, the standard schedule is Monday through Friday, 9:00 a.m. to 7:00 p.m. Actual hours may vary depending on business needs and role responsibilities. All employees receive meal and rest breaks in accordance with applicable state and local laws.

For designated remote roles, this in-person policy does not apply.

Equal Opportunity

Mochi Health is an Equal Opportunity Employer. We make all employment decisions based solely on merit. We provide equal employment opportunities to all applicants and employees without discrimination on the basis of race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, disability status, or any other applicable legally protected characteristic. We prohibit any form of discrimination or harassment. This policy applies to all terms and conditions of employment, including hiring.

Candidate Privacy Notice

Please review Mochi Health's Candidate Privacy Notice here.

Accommodations

Mochi Health complies with the Americans with Disabilities Act (ADA), as amended by the ADA Amendments Act, and all applicable state or local laws. We will reasonably accommodate qualified individuals with a disability during the application process and throughout employment, as required by law.

If you need any assistance or accommodations due to a disability, please contact us at .


Required Experience:

Staff IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting