Principal Product Engineer, Cloud Platform

Verdigris

Job Location:

Palo Alto, CA - USA

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

GPU racks pull 120-140 kW today. By 2027 that number hits 600 kW to 1 MW per rack. The entire AI buildout, hundreds of billions in capex, is being erected on a grid that was not designed for it. Design margins have compressed from 30% to 10-15%. The monitoring systems built for the last generation of infrastructure poll at one-second intervals. GPU workloads ramp in eight milliseconds.

AI is accelerating faster than the infrastructure beneath it can be understood.

The incumbent vendors (Schneider, Eaton, Vertiv) were built for a world where loads were predictable and slow. They are not broken. They are mismatched to what AI infrastructure demands. Verdigris captures continuous waveforms at 8 kHz. That is not a software improvement on existing monitoring data. It is a different measurement entirely, one that makes visible what no other system can see: hidden degradation, safe operating headroom, and the real-time electrical behavior of infrastructure running at the edge of its design limits.

We are not a monitoring solution. We are the electrical intelligence layer: the validation layer that sits between the physical environment and the autonomous control systems the industry is building toward. Solving this matters beyond the business case. Carbon-free AI, stranded capacity recovery, and the long-term reliability of the compute layer the world is betting on all depend on getting electrical intelligence right at the physical layer.

The company

Thirty people. Lean by design. We have raised serious capital, refocused the company around the most consequential problem in AI infrastructure, and come out the other side with real customers, real revenue, and hardware that has been running in colocation and owned data center facilities for more than a decade. The cloud platform processes billions of 8 kHz waveform readings and turns them into validated operating limits that operators use daily.

This unique position, built on our high-fidelity 8 kHz metering, converts the strain on electrical infrastructure into a definitive roadmap for solving the AI industry's most critical power bottleneck and driving the sector's next wave of technological improvement.

Today that means reliability and early warning. Tomorrow it means capacity optimization and machine-facing orchestration APIs that GPU schedulers consume directly.

The role

We are hiring a Principal Product Engineer to own the engineering side of the roadmap for the cloud platform, the system that makes all three product pillars work: Observability, Intelligence, and Orchestration.

You report to the cofounder/CTO and partner with Product as a peer on what we build, in what order, and with what architecture. This is a senior individual-contributor, player-coach role: no direct reports. You set the technical bar through what you ship and how you review, not through formal management. You will ship code in production, debug the hardest reliability and performance problems, write the RFCs others reference, and anchor the engineering operating cadence as a contributor. If you have not been in a codebase recently, this is not the right fit.

We are building toward best-in-class industry standards: clear ownership, a culture of high craft, and senior engineering that accelerates the team through example rather than administration. The candidate we want believes in this velocity.

One more thing: a big part of how we operate is through deliberate, opinionated use of agentic coding tools. The team is actively migrating towards an AI-native culture, learning how to adopt practices that scale. You will be instrumental in defining the next standard for AI-native development here, and you will hold the bar through your own work and through design and code reviews.

The situation

The platform works. Customers depend on it. The 8 kHz ingestion pipeline is real and running in production.

The platform is at a strategic inflection point: we must mature the architecture and organizational structure to support the scale and velocity of our next-generation product roadmap. We need someone who can take ownership of the platform, drive engineering clarity across the surface, and raise the quality bar while also building toward future application layers that do not exist yet.

First 6 months

Audit the platform: reliability, scalability, observability, tech debt. Form your own view, not just ours.
Organize ownership across the three-pillar stack. Ingestion and the 8 kHz pipeline. ML signal processing and validated operating limits. The APIs, MCPs, and workflows that deliver them. Own the engineering side of the roadmap: what we build, in what order, with what architecture.
Anchor the engineering operating cadence as a contributor and bar-setter. Roadmap reviews, incident reviews, delivery planning, architecture reviews.
Get your hands dirty on the hardest reliability and performance problems. Ship fixes, not just plans.
Establish AI-native development practices on the team. Not a policy, but real tooling, norms, and a shared view on where agentic coding accelerates and where it creates new risk.
Raise the hiring bar through interview rigor and design-review presence. Surface gaps you see in the team.

By 12 months, here is what success looks like

Platform reliability and deployment velocity are measurably better. Fewer fires, faster fixes.
The customer product surface ships predictably. Engineering decisions on your surface don't bottleneck on the CTO.
There is an engineering roadmap people trust, one that connects today's reliability work to the capacity optimization and orchestration capabilities we are building toward.
You've helped land 1-2 hires through your interview rigor.
We are capitalizing on well-architected foundations, enabling us to move up the value delivery chain with our customers through a suite of well-thought-through applications.
The platform is positioned to support machine-facing orchestration APIs: the layer where validated intelligence feeds directly into GPU schedulers and demand response systems.

What we are looking for

Real technical depth in cloud infrastructure, data systems, or ML platforms. You can review architecture, debug production, and make tradeoffs, not just delegate them.
You've owned a product surface end-to-end at a senior IC level. You set the technical bar by what you ship and how you review, and you raise the team around you without formal management.
You can operate without a clean roadmap. You turn ambiguity into a plan with owners and timelines.
You care about production quality. Observability, incident response, release discipline. You build the habits, not just the systems.
You have strong opinions about how agentic coding tools change what a small team can build. You are actively shaping how your team works with AI, and you have the judgment to know where it helps and where it introduces new failure modes.
You are pulled by the mission. AI infrastructure is being built on a foundation that was not designed for it. Verdigris is the layer that makes it trustworthy. That framing should feel meaningful to you, not just interesting.
You partner with Product as a peer. You translate customer escalations, analytics signals, and operator workflows into a build plan with owners and timelines.

Why this role

You would work directly with the founding team and own the platform that makes the product work.
The company is small enough that your decisions show up in the product and the culture within months. A lean team operating with the right practices and the right people can build like a team ten times its size. You will define what that looks like here.
The 8 kHz ingestion pipeline is already running in production. You are not starting from zero. You are taking something real and making it significantly better, on infrastructure that actually matters.
If you are at a bigger company wondering whether you will ever get to build something from a position of real ownership, this is that role.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Required Experience:

Staff IC

About Company

Verdigris

Verdigris enables smart buildings through AI and proprietary real-time energy monitoring hardware. Verdigris delivers insights on energy usage per device when it's critical to see it.
