Senior Databricks Data Engineer

Inetum

Job Location:

Bucharest - Romania

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1 Vacancy

Job Summary

 

To develop, implement, and optimize complex Data Warehouse (DWH) and Data Lakehouse solutions using the Databricks platform (including Delta Lake, Unity Catalog, and Spark) to ensure a scalable, high-performance, and governed data foundation for analytics, reporting, and Machine Learning.

Responsibilities

A. Databricks Development and Architecture

  • Advanced Design and Implementation: Design and implement robust, scalable, and high-performance ETL/ELT data pipelines using PySpark/Scala and Databricks SQL on the Databricks platform.
  • Delta Lake: Implement and optimize the Medallion architecture (Bronze, Silver, Gold) using Delta Lake to ensure data quality, consistency, and historical tracking.
  • Lakehouse Platform: Implement the Lakehouse architecture efficiently on Databricks, combining best practices from DWH and Data Lake approaches.
  • Performance Optimization: Optimize Databricks clusters, Spark operations, and Delta tables (e.g. Z-ordering, compaction, query tuning) to reduce latency and computational costs; a PySpark sketch follows this list.
  • Streaming: Design and implement real-time/near-real-time data processing solutions using Spark Structured Streaming and Delta Live Tables (DLT); see the second sketch below.
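
For context, here is a minimal PySpark sketch of the kind of bronze-to-silver Medallion step described above, assuming hypothetical Unity Catalog table names (main.bronze.events, main.silver.events): it upserts deduplicated records with a Delta MERGE, then compacts and Z-orders the target table.

```python
# Illustrative bronze -> silver step in a Medallion pipeline (names are placeholders).
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is predefined

# Read raw events from the bronze layer and apply light cleansing.
bronze = (spark.read.table("main.bronze.events")
          .where(F.col("event_ts").isNotNull())
          .dropDuplicates(["event_id"]))

# MERGE (upsert) into the silver table so reruns stay idempotent.
silver = DeltaTable.forName(spark, "main.silver.events")
(silver.alias("t")
 .merge(bronze.alias("s"), "t.event_id = s.event_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())

# Compact small files and co-locate rows on a frequent filter column.
spark.sql("OPTIMIZE main.silver.events ZORDER BY (event_ts)")
```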

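And a minimal Delta Live Tables sketch of a streaming flow with a declarative data-quality expectation; the landing path and table names are placeholders, and the source assumes Databricks Auto Loader (cloudFiles). Inside a DLT pipeline, `spark` is provided by the runtime.

```python
import dlt
from pyspark.sql import functions as F

# Streaming bronze table ingested with Auto Loader (path is illustrative).
@dlt.table(comment="Raw events landed from cloud storage")
def bronze_events():
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/main/raw/events"))

# Silver table with a declarative quality rule: rows failing the expectation
# are dropped and surfaced in the pipeline's data-quality metrics.
@dlt.table(comment="Cleansed events")
@dlt.expect_or_drop("valid_event", "event_id IS NOT NULL AND event_ts IS NOT NULL")
def silver_events():
    return dlt.read_stream("bronze_events").withColumn("ingested_at", F.current_timestamp())
```
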
B. Governance and Security

  • Unity Catalog: Implement and manage Unity Catalog for centralized data governance, fine-grained security (row/column-level security), and data lineage.
  • Data Quality: Define and implement data quality standards and rules (e.g. using DLT expectations or Great Expectations) to maintain data integrity; a governance sketch follows this list.
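
As a sketch of the governance side, the following illustrative Databricks SQL (issued here via spark.sql from a notebook) grants access to a hypothetical `analysts` group and attaches a row-filter function for row-level security; every catalog, table, and group name is a placeholder.

```python
# Hypothetical Unity Catalog governance DDL; all names are placeholders.
spark.sql("GRANT USE CATALOG ON CATALOG main TO `analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.silver TO `analysts`")
spark.sql("GRANT SELECT ON TABLE main.silver.events TO `analysts`")

# Row-level security: a boolean SQL UDF used as a row filter.
spark.sql("""
CREATE OR REPLACE FUNCTION main.silver.eu_only(region STRING)
RETURN IF(is_account_group_member('eu_analysts'), true, region = 'EU')
""")
spark.sql("ALTER TABLE main.silver.events SET ROW FILTER main.silver.eu_only ON (region)")
```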

C. Operations and Collaboration

  • Orchestration: Develop and manage complex workflows using Databricks Workflows (Jobs) or external tools (e.g. Azure Data Factory, Airflow) to automate pipelines; see the sketch after this list.
  • DevOps/CI/CD: Integrate Databricks pipelines into CI/CD processes using tools such as Git, Databricks Repos, and Databricks Asset Bundles.
  • Collaboration: Work closely with Data Scientists, Analysts, and Architects to understand business requirements and deliver optimal technical solutions.
  • Mentorship: Provide technical guidance and mentorship to junior developers and promote best practices.
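
A minimal orchestration sketch using the Databricks Python SDK (databricks-sdk) to define a one-task job; the job name, notebook path, and cluster id are placeholders. For CI/CD, Databricks Asset Bundles express the same job declaratively in YAML alongside the code.

```python
# Hypothetical one-task job defined through the Databricks Python SDK.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()  # picks up auth from the environment / .databrickscfg

job = w.jobs.create(
    name="nightly-silver-refresh",  # placeholder name
    tasks=[
        jobs.Task(
            task_key="bronze_to_silver",
            notebook_task=jobs.NotebookTask(
                notebook_path="/Repos/de/pipelines/silver"  # placeholder path
            ),
            existing_cluster_id="0000-000000-example",  # placeholder cluster id
        )
    ],
)
print(f"Created job {job.job_id}")
```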

Qualifications:

A. Mandatory Knowledge (Expert Level)

  • Databricks Platform: Proven expert-level experience with the entire Databricks ecosystem (Workspace, Cluster Management, Notebooks, Databricks SQL).
  • Apache Spark: In-depth knowledge of Spark architecture (RDDs, DataFrames, Spark SQL) and advanced optimization techniques.
  • Delta Lake: Expertise in implementing and managing Delta Lake (ACID properties, Time Travel, MERGE, OPTIMIZE, VACUUM); see the maintenance sketch after this list.
  • Programming Languages: Advanced/expert-level proficiency in Python (with PySpark) and/or Scala (with Spark).
  • SQL: Advanced/expert-level skills in SQL and data modeling (Dimensional, 3NF, Data Vault).
  • Cloud: Solid experience with a major cloud platform (AWS, Azure, or GCP), especially with storage services (S3, ADLS Gen2, GCS) and networking.
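
For reference, the Delta Lake features listed above (Time Travel, OPTIMIZE, VACUUM) map to short commands; a minimal sketch, reusing the illustrative table name from earlier and assuming a Databricks notebook where `spark` is predefined:

```python
# Delta Lake maintenance sketch; `main.silver.events` is an illustrative name.
# Time travel: query an earlier version of the table for audits or debugging.
v0 = spark.sql("SELECT * FROM main.silver.events VERSION AS OF 0")

# Inspect the transaction log behind time travel and ACID guarantees.
spark.sql("DESCRIBE HISTORY main.silver.events").show(truncate=False)

# Remove data files no longer referenced within the retention window (7 days).
spark.sql("VACUUM main.silver.events RETAIN 168 HOURS")
```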

B. Additional Knowledge (Major Advantage)

  • Unity Catalog: Hands-on experience with implementing and managing Unity Catalog.
  • Lakeflow: Experience with Delta Live Tables (DLT) and Databricks Workflows.
  • ML/AI Concepts: Understanding of basic MLOps concepts and experience with MLflow to facilitate integration with Data Science teams; a minimal tracking sketch follows this list.
  • DevOps: Experience with Terraform or equivalent Infrastructure as Code (IaC) tools.
  • Certifications: Databricks certifications (e.g. Databricks Certified Data Engineer Professional) are a significant advantage.
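
A minimal MLflow tracking sketch of the integration point mentioned above; the experiment name and logged values are made up for illustration.

```python
import mlflow

# Log a run with a placeholder parameter and metric to a shared experiment.
mlflow.set_experiment("/Shared/demo-experiment")  # placeholder experiment
with mlflow.start_run(run_name="baseline"):
    mlflow.log_param("max_depth", 5)   # illustrative parameter
    mlflow.log_metric("rmse", 0.42)    # illustrative metric
```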

C. Education and Experience

  • Education: Bachelor's degree in Computer Science, Engineering, Mathematics, or a relevant technical field.
  • Professional Experience: Minimum of 5 years of experience in Data Engineering, with at least 3 years working with Databricks and Spark at scale.

Additional Information:

Benefits

  • Full access to foreign language learning platform
  • Personalized access to tech learning platforms
  • Tailored workshops and trainings to sustain your growth
  • Medical insurance
  • Meal tickets
  • Monthly budget to allocate on flexible benefit platform
  • Access to 7 Card services
  • Wellbeing activities and gatherings

Working model: hybrid - 2 days at the office


Remote Work:

Yes


Employment Type:

Full-time


Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala

About Company

Inetum is a European leader in digital services. Inetum’s team of 28,000 consultants and specialists strive every day to make a digital impact for businesses, public sector entities and society. Inetum’s solutions aim at contributing to its clients’ performance and innovation as well ...
