Senior Director of AI Infrastructure & Engineering

Job Location: Redwood City, CA - USA

Base Pay Range: $435,000 - $621,500
Posted on: 30+ days ago
Vacancies: 1

Job Summary

The Chan Zuckerberg Initiative was founded in 2015 by Priscilla Chan and Mark Zuckerberg to help solve some of society's toughest challenges, from curing or preventing disease to improving education and addressing the needs of our local communities. We provide operational support across our areas of work.

The Team

CZI supports the science and technology that will make it possible to help scientists cure, prevent, or manage all diseases by the end of this century. While this may seem like an audacious goal, in the last 100 years biomedical science has made tremendous strides in understanding biological systems, advancing human health, and treating disease.

Achieving our mission will only be possible if scientists are able to better understand human biology. To that end, we have identified four grand challenges that will unlock the mysteries of the cell and how cells interact within systems, paving the way for new discoveries that will change medicine in the decades that follow:

  • Building an AI-based virtual cell model to predict and understand cellular behavior
  • Developing state-of-the-art imaging systems to observe living cells in action
  • Instrumenting tissues to better understand inflammation, a key driver of many diseases
  • Engineering and harnessing the immune system for early detection, prevention, and treatment of disease

CZI's work in science includes grantmaking programs, open-source software development, and close collaboration with the Chan Zuckerberg Biohub Network. The CZ Biohub Network includes the San Francisco, Chicago, and New York Biohubs, as well as the Chan Zuckerberg Imaging Institute. CZI also collaborates with institutional partners like the Kempner Institute for the Study of Natural & Artificial Intelligence at Harvard University. Join us in accelerating science.

Our Central Tech team provides technology and security support for CZI and our grantees. We believe that Engineering, IT, and Security are most effective when in sync and learning from each other on a daily basis. Across our three pillars of Infrastructure, Security, and Grantee & Partner Support, we enable our teams to achieve their goals faster and more securely. We leverage technology to automate manual processes, constantly innovate to optimize operations, provide first-class support, and build solutions to enable the scale and execution of our business partners' strategies and initiatives.

The AI/ML Infrastructure team builds shared tools and platforms used across the Chan Zuckerberg Initiative, partnering with and supporting an extensive group of Research Scientists, Data Scientists, and AI Research Scientists, as well as a broad range of Engineers focusing on Education and Science domain problems. Members of the shared infrastructure engineering team have an impact on all of CZI's initiatives by enabling the technology solutions used by other engineering teams at CZI to scale.

The Opportunity

We are seeking a Senior Director of Engineering to lead our infrastructure organization, spanning AI infrastructure engineering, AI/ML operations, data infrastructure, cloud infrastructure, and security engineering. This leader will drive strategy, execution, and innovation to support AI research, web products, and production workloads across hybrid environments (cloud and on-prem HPC). This team manages the largest cluster for scientific research in the world, with more than 1,300 GPUs (NVIDIA H100 and H200), and enables scientific research and development of various biological models (like GREmLN and TranscriptFormer) from vast biological datasets acquired through our Biohub labs, partnerships, and open science repositories. The role is highly cross-functional, partnering closely with product, research, and operations teams to deliver scalable, secure, and high-performing systems.

What You'll Do

  • Define and execute the long-term vision and roadmap for AI, data, cloud, and security infrastructure, with clear metrics to measure progress and outcomes.
  • Oversee the design and operation of hybrid GPU compute clusters and ML platforms to support training, fine-tuning, and inference workloads.
  • Ensure robust, scalable, and efficient data infrastructure and cloud operations to power analytics, ML pipelines, and product needs.
  • Drive reliability, observability, and cost optimization across GPU-based workloads for development, training, and inference.
  • Implement modern AI/ML Ops practices (orchestration of model training workloads, reproducibility, automated monitoring) to accelerate research and production workflows, with a focus on continuous delivery and improvement.
  • Build, mentor, and scale high-performing, multi-disciplinary engineering teams.
  • Partner with product, research, and executive leadership to align infrastructure with organizational priorities, ensuring delivery is measured against agreed objectives and key results.
  • Establish policies for infrastructure usage, prioritization, and compliance with regulatory requirements.
  • Stay ahead of emerging technologies in AI infrastructure, cloud, and security; drive their strategic adoption.

What You'll Bring

  • 15 years in engineering, with at least 7 years in senior leadership roles managing multi-disciplinary teams and organizations of 30 employees, including experience leading and developing managers.
  • Strong knowledge of AI/ML frameworks (e.g., PyTorch) and MLOps tools (e.g., Kubeflow, MLflow, Ray).
  • Experience managing both traditional cloud platforms (AWS, GCP, Azure) and AI cloud (HPC/GPU clusters).
  • Deep experience with large-scale data systems, pipelines, and storage technologies.
  • Track record of improving reliability, observability, and cost efficiency in large-scale systems.
  • Proven ability to define multi-year infrastructure strategies while delivering on immediate priorities.
  • Exceptional written and verbal communication skills, capable of engaging technical and non-technical audiences.
  • Ability to provide clear leadership and momentum in an ambiguous environment: setting direction, aligning teams, and turning uncertainty into forward progress.

Compensation

The Redwood City, CA base pay range for a new hire in this role is $435,000 - $621,500. New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time. Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process.

Pay ranges outside Redwood City are adjusted based on cost of labor in each respective geographical market. Your recruiter can share more about the specific pay range for your location during the hiring process.

Work Mode

As we grow, we're excited to strengthen in-person connections and cultivate a collaborative, team-oriented environment. This role is a hybrid position requiring you to be onsite for at least 60% of the working month, approximately 3 days a week, with specific in-office days determined by the team's manager. The exact schedule will be at the hiring manager's discretion and communicated during the interview process.

Benefits for the Whole You

We're thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible.

  • Generous employer match on employee 401(k) contributions to support planning for the future.
  • Paid time off to volunteer at an organization of your choice.
  • Funding for select family-forming benefits.
  • Relocation support for employees who need assistance moving.

If you're interested in a role but your previous experience doesn't perfectly align with each qualification in the job description, we still encourage you to apply, as you may be the perfect fit for this or another role.


Required Experience: Exec

Key Skills

  • Go
  • Lean
  • Management Experience
  • React
  • Node.js
  • Operations Management
  • Project Management
  • Research & Development
  • Software Development
  • Team Management
  • GraphQL
  • Leadership Experience

About Company

The Chan Zuckerberg Initiative (CZI) is a new kind of philanthropy that’s on a mission to help build a more inclusive, just and healthy future for everyone.
