Senior Software Engineer - Distributed Data Systems

Job Location

San Francisco, CA - USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

P59

At Databricks, we are passionate about enabling data teams to solve the world's toughest problems, from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. Founded by engineers and customer obsessed, we leap at every opportunity to solve technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.

Modern data analysis employs sophisticated methods, such as machine learning, that go well beyond the roll-up and drill-down capabilities of traditional SQL query engines. As a software engineer on the Runtime team at Databricks, you will build the next generation of distributed data storage and processing systems: systems that can outperform specialized SQL query engines in relational query performance, yet provide the expressiveness and programming abstractions to support diverse workloads ranging from ETL to data science.

Below are some example projects:

Apache Spark: Develop the de facto open source standard framework for big data.

Data Plane Storage: Provide reliable, high-performance services and client libraries for storing and accessing very large amounts of data on cloud storage backends, e.g. AWS S3 and Azure Blob Store.

Delta Lake: A storage management system that combines the scale and cost-efficiency of data lakes, the performance and reliability of a data warehouse, and the low latency of streaming. Its higher-level abstractions and guarantees, including ACID transactions and time travel, drastically reduce the complexity of real-world data engineering architecture.

Delta Pipelines: It's difficult to manage even a single data engineering pipeline. The goal of the Delta Pipelines project is to make it simple to orchestrate and operate tens of thousands of data pipelines. It provides a higher-level abstraction for expressing data pipelines, enables customers to deploy, test, and upgrade pipelines, and eliminates the operational burden of building and managing high-quality data pipelines.

Performance Engineering: Build the next-generation query optimizer and engine that's fast, tuning-free, scalable, and robust.

What we look for:

  • BS (or higher) in Computer Science, a related technical field, or equivalent practical experience.
  • Comfortable working towards a multi-year vision with incremental deliverables.
  • Motivated by delivering customer value and impact.
  • 5 years of production-level experience in Java, Scala, or C.
  • Strong foundation in algorithms and data structures and their real-world use cases.
  • Experience with distributed systems, databases, and big data systems (Apache Spark, Hadoop).


Required Experience:

Senior IC

Employment Type

Full Time
