Senior Data Engineer – Azure Databricks

Job Location:

Seattle, OR - USA

Monthly Salary: Not Disclosed
Posted on: 7 hours ago
Vacancies: 1

Job Summary

Role Description / Essential Skills
Data Engineering Development
Design, develop, and optimize ETL/ELT data pipelines using Azure Databricks (Spark) and PySpark.
Build and maintain large-scale distributed data processing systems.
Work with structured, semi-structured, and unstructured data using Spark DataFrames, Delta Lake, and similar technologies.
Implement Delta Lake architecture to support ACID transactions, time travel, and scalable storage.
Azure Cloud Ecosystem
Develop solutions leveraging:
  • Azure Data Lake Storage (ADLS)
  • Azure Synapse
  • Azure Data Factory (ADF)
  • Azure Key Vault
Build automated data workflows using ADF pipelines and Databricks jobs.
Optimize compute clusters and manage Databricks workspaces for performance and cost efficiency.
Data Modeling & Architecture
Design and implement data models including star and snowflake schemas.
Develop optimal database structures to support analytics and reporting.
Ensure adherence to data governance, data quality, and security best practices.
Collaborate with architecture teams to define scalable and secure cloud data platforms.
Collaboration & Agile Delivery
Partner with data scientists, business analysts, and product teams to deliver high-quality, reliable datasets.
Translate business requirements into detailed technical specifications.
Participate in code reviews, mentor junior engineers, and promote coding best practices.
Desirable Skills:
Skills: Digital: Microsoft Azure; Digital: Databricks; Digital: PySpark
Experience Required: 8-10 years

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala