Senior Software Engineer, Compute Infrastructure
San Francisco, CA - USA
Job Summary
RDQ427R175
At Databricks, we are passionate about helping data teams solve the world's toughest problems, from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. Founded by engineers and customer obsessed, we leap at every opportunity to solve technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.
At Databricks, the Compute Infrastructure organization builds and operates the foundation that runs all data, AI, and stateful workloads across all major clouds. Our platform launches tens of millions of VMs per day, operates thousands of Kubernetes clusters, and must deliver extreme elasticity, reliability, and cost efficiency.
As a Senior Software Engineer on the Compute Infra team, you will design and build the systems that power Databricks' compute infrastructure, enabling engineers to quickly launch and scale world-class products.
The impact you will have:
- Develop the compute abstractions that provide powerful capabilities for all Databricks workloads, enabling engineers to build world-class products with high velocity and best-in-class performance
- Design the orchestration and scheduling systems that run all types of workloads (serving, batch, stateful, GPU) with high performance and efficiency
- Scale the fleet management systems that launch and configure millions of VMs every day across cloud providers
- Raise the technical and operational bar through strong design practices, testing, and a culture of engineering excellence and platform mindset
- Lead cross-team initiatives that span product and infrastructure surface areas
What we look for:
- BS (or higher) in Computer Science or a related field
- 5 years of experience designing and building large-scale distributed systems
- Strong proficiency in one or more languages such as Java, Scala, Go, or C
- Experience with service-oriented architectures and large-scale distributed systems
- Familiarity with cloud platforms (AWS, Azure, GCP) and container/orchestration technologies (Kubernetes, Docker)
- Track record of shipping infrastructure that supports mission-critical workloads at scale
Required Experience:
Senior IC
About Company
The Databricks Platform is the world’s first data intelligence platform powered by generative AI. Infuse AI into every facet of your business.