Job title: Senior Technical Program Manager
Location: Mountain View, CA
Job type: Hybrid
Work timing: Mon-Fri, 9am-5pm PST
Duration: 6 months
Summary:
The R&D Operations Organization is seeking a Senior Technical Program Manager (TPM) with a background in distributed AI resource management, forecasting, capacity, and strategic planning to join our team. This role involves supporting platform operations, handling customer escalations, and monitoring cluster health. Additionally, you'll ensure optimal compute resource allocation aligned with product, sales, and research priorities, and drive decisions with data analysis and reporting in GenAI. This job is fast-paced and cross-functional, and it requires strong communication, prioritization, and organization skills. It involves a deep understanding of stakeholder management and the ability to navigate an environment with passionate people intent on delivering valuable products.
Responsibilities:
- Act as a single point of contact for escalations from sales and global support teams, and help with various billing and support issues.
- Drive innovation and deliver high-quality products by ensuring that AI teams have the GPUs and other resources they need.
- Improve product margins by leading strategic initiatives to optimize GPU utilization and procurement.
- Establish and maintain effective communication with technical and non-technical stakeholders and customers, including regular project updates, status reports, and presentations.
- Deliver step-level improvements in compute management efficiency and scalability by identifying and implementing process improvements.
- Ensure strategic alignment across Sales, Global Support, and Engineering.
Requirements:
- 6 years of professional experience with a Bachelor's degree, and related experience in technical program management, distributed platforms, resource management, execution, and strategic planning.
- Proven track record of driving cross-functional teams to deliver complex technical projects on time and with high quality.
- Excellent communication, negotiation, and analytical skills, with the ability to document standard operating procedures and processes.
- Advanced working SQL knowledge. Ability to build and maintain analytics to track, forecast, and visualize consumption through ad-hoc SQL reports and dashboards.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Self-motivated and able to work independently as well as in a team environment.
- Preferred: good working knowledge of GPU technology and its applications in generative AI and machine learning.
- Familiarity with big data technologies such as Apache Spark, Delta Lake, and MLflow is a plus.
- Experience with compute capacity management, as well as financial analysis or sales/deal desk quoting, is a plus.
Skills:
- Advanced Project Management
- Advanced SQL
- Cloud (AWS/Azure) knowledge
- Technical Program Management
- Dashboarding / BI skills
- Forecasting and capacity management experience
- Python