Data Engineer

InFynd


Job Location:

Coimbatore - India

Monthly Salary: Not Disclosed
Experience Required: 5-10 years
Posted on: 7 days ago
Vacancies: 1 Vacancy

Job Summary

  • Design and implement scalable batch and streaming data pipelines using Azure Databricks, Delta Live Tables, and Apache Spark.
  • Build and maintain ETL/ELT workflows orchestrated through Azure Data Factory and Databricks Workflows.
  • Develop reusable and modular pipeline components following software engineering best practices.
  • Architect and manage Lakehouse solutions using Delta Lake across Bronze, Silver, and Gold layers.
  • Design and enforce data models, schemas, and governance policies using Unity Catalog.
  • Optimize storage, partitioning, and query performance for large-scale datasets on ADLS Gen2.
  • Manage Databricks clusters, compute policies, and job scheduling.
  • Implement Infrastructure as Code (IaC) using Terraform or ARM templates.
  • Integrate Databricks with Azure services including Synapse, Event Hubs, Key Vault, and Azure DevOps.


Requirements

  • Strong proficiency in PySpark, Python, and SQL for Big Data processing.
  • Hands-on experience with Delta Lake, Delta Live Tables (DLT), and Medallion Architecture.
  • Strong experience with Azure Data Services including:
    • Azure Data Lake Storage Gen2 (ADLS Gen2)
    • Azure Data Factory (ADF)
    • Azure Synapse Analytics
    • Azure Event Hubs
  • Experience with Databricks Unity Catalog for data governance and access control.
  • Experience implementing CI/CD pipelines using Azure DevOps or GitHub Actions for Databricks deployments.
  • Strong understanding of distributed computing concepts, Spark optimization, partitioning, and performance tuning.
  • Experience with streaming data processing using Structured Streaming, Kafka, or Event Hubs.


