Senior Staff Software Engineer - Lakeflow Pipelines Datasets
San Francisco, CA - USA
Job Summary
Databricks builds the world's leading data and AI platform, used by tens of thousands of organizations to turn raw data into decisions that matter.
The Lakeflow Engineering team owns the entirety of the ETL product line: Declarative Dataflow Graphs, Materialized Views, Structured Streaming, and Flows. We run what is arguably the world's largest data engineering platform, processing exabytes of data daily across tens of thousands of customers. The problems we solve (fault-tolerant incremental computation, stream-batch unification, declarative pipeline orchestration at massive scale) sit at the frontier of what distributed data systems can do.
We're looking for a Senior Staff Engineer to serve as a technical architect and engineering leader for this platform. You'll own the long-term technical direction for how pipelines are defined, optimized, executed, and operated, shaping the systems that data teams worldwide depend on every day. You will work across one or more of the following areas to design and build the next generation of data infrastructure:
- Declarative Dataflow Processing
- Resource management
- Efficient storage structures
- Stream Processing and Complex Event Processing
- Autonomic Computing and Self-Managing Systems
The impact you will have:
- Architect and deliver a highly scalable, fault-tolerant platform for declarative pipelines and materialized views, serving thousands of production customers.
- Turn ambiguous, large-scale technical challenges into clear execution plans that ship incrementally and hold up at exabyte scale.
- Bridge the gap between research and production by bringing ideas from the academic literature or internal R&D into systems that solve real customer problems.
- Lead deep systems work: performance diagnosis on large production clusters, low-level debugging, and optimization that moves the needle on cost and latency.
- Shape product direction alongside engineering and product leadership, owning technical strategy from design through delivery.
- Drive complex, multi-team technical initiatives that require sustained coordination and clear technical judgment.
What We're Looking For:
- 10 years of experience building, operating, and evolving large-scale distributed systems in production.
- Deep expertise in one or more of: database internals, storage systems, distributed computing, streaming systems, language/API design, or performance engineering.
- Track record of executing against a multi-year technical vision through well-sequenced, incremental milestones.
- Strong algorithmic foundations, with the practical judgment to know when they matter and when simpler solutions win.
- A bias toward customer impact over technical novelty for its own sake.
- Experience building alignment across teams and driving initiatives from conception through customer adoption.
- BS, MS, or PhD in Computer Science or a related field (or equivalent depth of experience).
Required Experience:
Staff IC
About Company
The Databricks Platform is the world’s first data intelligence platform powered by generative AI. Infuse AI into every facet of your business.