Director of Engineering (Data Infrastructure)

Bengaluru - India

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

(P-1384)

Databricks processes petabytes of data and billions of transaction events daily - every cluster launch, every query executed, every dollar billed flows through infrastructure that must never fail. When we process billions in billing transactions with 99.999% accuracy requirements, when we ingest terabytes per second across 100 regions, when a five-minute outage costs millions in revenue and customer trust - infrastructure isn't just important, it's existential. The next phase of our growth demands disaster recovery systems that prove reliability rather than hope for it, testing frameworks that catch production-scale problems before deployment, correctness guarantees that make billing errors structurally impossible, and automation that scales operations sublinearly with growth.

In this leadership opportunity, you will build the data infrastructure organization that makes Databricks' continued growth possible. You'll establish foundational teams in Bengaluru owning the bedrock systems that guarantee billing correctness, operational resilience, and zero-downtime recovery across our entire monetization stack, alongside multi-region data ingestion, developer platforms, and deployment automation that eliminate friction at petabyte scale. This isn't about maintaining what exists; it's about architecting the infrastructure that enables Databricks to scale while reducing operational burden. You'll define what world-class infrastructure looks like for the next decade of data platforms.

You will pursue these challenges as a founding technical leader in our fastest-growing engineering hub and strategic partner to global infrastructure. In addition to building world-class teams, you will shape architectural decisions that ripple across the company and champion infrastructure-as-product thinking that transforms infrastructure into force multipliers globally. You'll work in an engineering culture born from Apache Spark and open source, where technical depth matters and infrastructure engineers are celebrated as craftspeople.

The perfect candidate has built infrastructure organizations at companies where five nines weren't simply aspirational, where petabyte-scale wasn't marketing but Monday, and where the infrastructure team's technical leverage determined whether the business could scale or stall. You have the technical depth to debate data architecture, the strategic vision to define multi-year platform roadmaps, the leadership craft to build teams that top engineers want to join, and, most importantly, the conviction that data infrastructure done right doesn't just support the business; it defines what's possible.

The impact you'll have:

Deliver the infrastructure vision for systems processing billions in daily billing transactions with zero tolerance for error, building disaster recovery that's provably reliable, testing frameworks that catch what production sees, correctness systems that make billing errors structurally impossible, and observability that predicts failures before they happen
Build Bengaluru's data infrastructure organization by establishing it as the destination for India's top infrastructure talent, hiring multiple engineering managers who become force multipliers, and creating a culture where solving hard distributed systems problems at scale is the daily work
Own business-critical systems operating 24/7/365 across 100 regions, where even 99.9% uptime means hours of customer pain, driving reliability improvements that prevent millions in revenue loss while eliminating operational toil through frameworks that make systems self-healing, self-tuning, and self-documenting
Ship platforms that compound engineering leverage across Databricks: correctness frameworks that catch billing errors before customers do, deployment automation that makes regional expansion push-button, data integration systems that process petabyte-scale flows without human intervention, and testing infrastructure where comprehensive coverage is automatic, not heroic
Position infrastructure as product by treating internal engineering teams as customers with SLAs, measuring adoption and satisfaction, iterating based on feedback, and demonstrating that every dollar invested in infrastructure returns multiplicative gains in product velocity, reliability improvements, or cost reductions

What you'll need:

14 years in distributed systems engineering, with 6 years leading infrastructure organizations and 4 years managing managers, at companies where infrastructure failures meant immediate revenue impact, customer escalations, or regulatory consequences - and you built the systems and teams that made those failures rare
Technical depth across petabyte-scale data pipelines and distributed systems reliability, where you can engage from "how should we architect multi-region disaster recovery?" to "why is this Kafka cluster exhibiting this latency pattern?" while knowing when to coach versus when to decide
Track record defining multi-year infrastructure vision and translating it into sequential deliverables that show value quarterly while building toward architectural end states, positioning infrastructure investments as business enablers rather than cost centers, and making build-vs-buy decisions that compound over time
Experience building 99.999% reliable systems with established practices for SLOs/SLIs, chaos engineering, disaster recovery, and sophisticated observability that predicts failures before they happen
Proven ability to scale infrastructure organizations in high-growth environments where you've doubled engineering while maintaining the quality bar, developed engineering managers, and created teams where retention is high because the problems are interesting and the culture is strong
Communication skills to make complex infrastructure decisions legible to executives (translating technical investments into business outcomes), influence cross-functional partners without authority, build trust across global teams in different timezones with different working styles, and represent Databricks' technical brand externally
BS in Computer Science or Engineering; MS or Ph.D. preferred. Experience with Apache Spark, Delta Lake, large-scale data infrastructure, fintech/billing systems, or leading infrastructure through hypergrowth strongly preferred

Required Experience:

Director
