P-1489
About Databricks
At Databricks, we are passionate about enabling data teams to solve the world's toughest problems, from making the next mode of transportation a reality to accelerating medical breakthroughs. We do this by building and operating the world's best data and AI infrastructure platform so our customers can turn deep data insights into real business impact. Founded by engineers and deeply customer-obsessed, we thrive on solving hard technical challenges, from next-generation data experiences to operating infrastructure at massive global scale. And we're only getting started. For more information visit .
The Role
Databricks is looking for a Staff Technical Program Manager to drive GenAI Operations and Capacity Planning for our large-scale LLM and GPU-backed platform. This role is designed for a senior, hands-on TPM who thrives in technically deep, data-driven environments and enjoys owning complex operational programs end to end.
As a Staff TPM, you will own execution for critical GenAI operational initiatives, operate with significant autonomy, and partner closely with AI/ML engineering, infrastructure, finance, partner ops, and cloud/LLM providers. You will use strong analytical skills to guide decisions, surface risks, and continuously improve how Databricks launches, scales, and governs GenAI workloads.
You will report to a Technical Program Leader and operate across multiple time zones in a fast-moving, highly ambiguous environment.
What You'll Do
GenAI & LLM Operations
- Plan and execute day-0 launches of new LLM models on Databricks, ensuring production readiness across engineering, commercialization, go-to-market, legal, and cloud service partners.
- Partner with AI/ML and platform engineering teams to operationalize LLM onboarding, rollout, and lifecycle management.
- Define and maintain launch checklists, operational runbooks, and success metrics for GenAI workloads.
GPU & LLM Capacity Planning
- Own GPU and LLM capacity planning, forecasting, and allocation for GenAI workloads.
- Build and maintain SQL-driven analytical models and dashboards to forecast demand, track utilization, and surface capacity risks.
- Balance customer demand, growth trajectories, and contractual commitments to inform short- and medium-term capacity decisions.
Utilization, Efficiency & Analytics
- Track and drive efficient consumption of GPU and LLM capacity, identifying underutilization, contention, and inefficiencies.
- Define and monitor KPIs for utilization, efficiency, and reliability of GenAI platforms.
- Use data to recommend improvements to engineering roadmaps, operational processes, and cost optimization efforts.
Governance, Controls & Reporting
- Execute governance mechanisms to ensure GenAI capacity usage aligns with contractual, financial, and compliance requirements.
- Produce clear, data-backed reporting for senior leaders on capacity health, utilization trends, and operational risks.
- Generate consumption reports, usage metrics reporting, and share-of-wallet attestations.
- Ensure documentation, controls, and processes are audit-ready and consistently followed.
What We Look For
Minimum Qualifications
- 10 years of overall industry experience, including 7 years in Technical Program Management.
- Experience leading cross-functional GenAI, AI/ML, or infrastructure programs from planning through launch and steady-state operations.
- Strong background in capacity planning, forecasting, and infrastructure analytics.
- Advanced SQL skills and hands-on experience building analytics dashboards and operational reporting.
- Ability to translate complex data into clear insights and recommendations for engineering and leadership stakeholders.
- Hands-on experience with at least one major cloud provider: AWS, Azure, or GCP.
- Familiarity with agile methodologies and program management tools such as Jira.
- Comfortable managing ambiguity, driving execution, and handling escalations when needed.
Preferred Qualifications
- Master's degree or advanced technical degree.
- Experience operating LLM, GPU, or GenAI platforms in production environments.
- Background in cloud infrastructure, distributed systems, or platform engineering.
- Previous software or hardware development experience.
About Databricks
Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake, and MLflow. Follow Databricks on Twitter, LinkedIn, and Facebook to learn more.
Benefits
At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all our employees. For specific details on the benefits offered in your region, please visit
Commitment to Diversity and Inclusion
At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We ensure our hiring practices meet equal employment opportunity standards and consider candidates without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical or mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, or other protected characteristics.
Compliance
If access to export-controlled technology or source code is required for performance of job duties, it is within the Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.
Pay Range Transparency
Databricks is committed to fair compensation practices. The pay range(s) for this role is listed below and represents the base salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on several factors that are unique to each candidate, including, but not limited to, job-related skills, depth of experience, relevant certifications and training, and specific work location. Based on the factors above, Databricks uses the full width of the range. The total compensation package for this position may also include eligibility for annual performance bonus, equity, and the benefits listed above. For more information about which range your location is in, visit our page here.
Required Experience:
Manager