Salary Not Disclosed
1 Vacancy
RDQ326R93
Databricks is radically simplifying the entire data lifecycle, from ingestion to generative AI and everything in between. We're doing it cross-cloud with a unified platform serving over 10k customers, processing exabytes of data per day on 15 million VMs and growing exponentially.
The Lakeflow team is looking for recent PhD graduates. The Lakeflow team's products include Apache Spark Structured Streaming, Delta Live Tables (DLT), and Materialized Views. Apache Spark Structured Streaming is one of the world's most popular streaming engines. DLT makes it easy to build and manage reliable batch and streaming data pipelines that deliver high-quality data on the Databricks Lakehouse Platform. DLT helps data engineering teams simplify ETL (extract, transform, load) development and management with declarative pipeline development, automatic data testing, and deep visibility for monitoring and recovery. DLT optimizes pipeline execution through logical optimization (query transformations) and physical optimization (such as instance type selection and vertical/horizontal autoscaling).
Moreover, as part of DLT, we have a new Catalyst optimization layer, Enzyme, designed specifically to speed up the ETL process and make declarative ETL computation possible by incrementally computing and materializing intermediate results. Enzyme can create and keep up to date a materialization of the results of a given query, stored in a Delta table. Enzyme does this by using a cost model to choose between a variety of techniques that borrow from the traditional literature on materialized view maintenance, delta-to-delta streaming, and manual ETL patterns commonly used by our customers.
As part of the Lakeflow DLT team, there are opportunities to design and implement in many areas that leapfrog existing systems:
What We Look For:
Databricks 2017. All rights reserved. Apache, Apache Spark, Spark, and the Spark logo are trademarks of the Apache Software Foundation.