Job Summary:
We are seeking a Senior Data Engineer (Databricks) with a strong development background in Azure Databricks and Python, who will be instrumental in building and optimising scalable data pipelines and solutions across the Azure ecosystem. This role requires hands-on development experience with PySpark, data modelling, and Azure Data Factory. You will collaborate closely with data architects, analysts, and business stakeholders to ensure reliable and high-performance data solutions.
Experience Required: 4 Years
Senior Data Engineer (Microsoft Azure, Databricks, Data Factory, Data Engineering, Data Modelling)
Key Responsibilities:
- Develop and Maintain Data Pipelines: Design, implement, and optimise scalable data pipelines using Azure Databricks (PySpark) for both batch and streaming use cases.
- Azure Platform Integration: Work extensively with Azure services including Data Factory, ADLS Gen2, Delta Lake, and Azure Synapse for end-to-end data pipeline orchestration and storage.
- Data Transformation & Processing: Write efficient, maintainable, and reusable PySpark code for data ingestion, transformation, and validation processes within the Databricks environment.
- Collaboration: Partner with data architects, analysts, and data scientists to understand requirements and deliver robust, high-quality data solutions.
- Performance Tuning and Optimisation: Optimise Databricks cluster configurations, notebook performance, and resource consumption to ensure cost-effective and efficient data processing.
- Testing and Documentation: Implement unit and integration tests for data pipelines. Document solutions, processes, and best practices to enable team growth and maintainability.
- Security and Compliance: Ensure data governance, privacy, and compliance are upheld across all engineered solutions, following Azure security best practices.
Preferred Skills:
- Strong hands-on experience with Delta Lake, including table management, schema evolution, and implementing ACID-compliant pipelines.
- Skilled in developing and maintaining Databricks notebooks and jobs for large-scale batch and streaming data processing.
- Experience writing modular, production-grade PySpark and Python code, including reusable functions and libraries for data transformation.
- Experience in streaming data ingestion and Structured Streaming in Databricks for near real-time data solutions.
- Knowledge of performance tuning techniques in Spark, including job optimisation, caching, and partitioning strategies.
- Exposure to data quality frameworks and testing practices (e.g. pytest, data validation libraries, custom assertions).
- Basic understanding of Unity Catalog for managing data governance, access controls, and lineage tracking from a developer's perspective.
- Familiarity with Power BI - able to structure data models and views in Databricks or Synapse to support BI consumption.
Required Experience:
Senior IC