Job Description:
We are looking for an experienced Databricks Data Engineer with strong DevOps expertise to join our data engineering team. The ideal candidate will design, build, and optimize large-scale data pipelines on the Databricks Lakehouse platform while implementing robust CI/CD and deployment practices. This role requires strong skills in PySpark, SQL, Azure cloud services, and modern DevOps tooling. You will collaborate with cross-functional teams to deliver scalable, secure, and high-performance data solutions.
Key Responsibilities:
1. Data Pipeline Development
Design, build, and maintain scalable ETL/ELT pipelines using Databricks.
Develop data processing workflows using PySpark/Spark and SQL for large-volume datasets.
Integrate data from ADLS, Azure Blob Storage, and relational/non-relational data sources.
Implement Delta Lake best practices, including schema evolution, ACID transactions, OPTIMIZE/Z-ORDER, and performance tuning.
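As an illustration of the Delta Lake maintenance practices above, a typical compaction-and-clustering pass might look like the following sketch (the table and column names are hypothetical placeholders, not part of this role's actual environment):

```sql
-- Compact small files and co-locate frequently filtered columns
-- to speed up selective reads. Table/column names are illustrative.
OPTIMIZE sales.transactions
ZORDER BY (customer_id, transaction_date);

-- Permit additive schema changes (schema evolution) during merges.
SET spark.databricks.delta.schema.autoMerge.enabled = true;

-- Remove data files no longer referenced by the Delta log
-- (default retention threshold: 7 days).
VACUUM sales.transactions;
```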
2. DevOps & CI/CD
Implement CI/CD pipelines for Databricks using Git, GitLab, Azure DevOps, or similar tools.
Build and manage automated deployments using Databricks Asset Bundles.
Manage version control for notebooks, workflows, libraries, and configuration artifacts.
Automate cluster configuration, job creation, and environment provisioning.
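The Asset Bundle deployments described above are driven by a databricks.yml definition. A minimal sketch might look like this (the bundle, job, notebook, and workspace values are hypothetical examples):

```yaml
# databricks.yml -- minimal Databricks Asset Bundle sketch.
# All names and the workspace host below are illustrative placeholders.
bundle:
  name: sales_pipeline

resources:
  jobs:
    daily_etl:
      name: daily-etl
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ../notebooks/ingest.py

targets:
  dev:
    mode: development
    workspace:
      host: https://adb-1234567890.12.azuredatabricks.net
```

A bundle like this is deployed per target with the Databricks CLI, e.g. `databricks bundle deploy -t dev`.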
3. Collaboration & Business Support
Work with data analysts and BI teams to prepare datasets for reporting and dashboarding.
Collaborate with product owners, business partners, and engineering teams to translate requirements into scalable data solutions.
Document data flows, architecture, and deployment processes.
4. Performance & Optimization
Tune Databricks clusters, jobs, and pipelines for cost efficiency and high performance.
Monitor workflows, debug failures, and ensure pipeline stability and reliability.
Implement job instrumentation and observability using logging/monitoring tools.
5. Governance & Security
Implement and manage data governance using Unity Catalog.
Enforce access controls, data security, and compliance with enterprise policies.
Ensure best practices around data quality, lineage, and auditability.
Technical Skills:
Strong hands-on experience with Databricks including:
o Delta Lake
o Unity Catalog
o Lakehouse Architecture
o Delta Live Tables (DLT)
o Databricks Runtime
o Table Triggers
Proficiency in PySpark, Spark, and advanced SQL.
Expertise with Azure cloud services (ADLS, ADF, Key Vault, Functions, etc.).
Experience with relational databases and data warehousing concepts.
Strong understanding of DevOps tools:
o Git/GitLab
o CI/CD pipelines
o Databricks Asset Bundles
Familiarity with infrastructure-as-code (Terraform is a plus).
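For the infrastructure-as-code skill noted above, a Terraform sketch using the Databricks provider might provision a job cluster roughly as follows (resource names, runtime version, and node type are illustrative assumptions):

```hcl
# Illustrative sketch using the databricks Terraform provider.
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Hypothetical autoscaling cluster for ETL workloads.
resource "databricks_cluster" "etl" {
  cluster_name  = "etl-cluster"
  spark_version = "15.4.x-scala2.12"
  node_type_id  = "Standard_DS3_v2"

  autoscale {
    min_workers = 2
    max_workers = 8
  }

  autotermination_minutes = 30
}
```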
Preferred Experience:
Knowledge of streaming technologies like Structured Streaming or Spark Streaming.
Experience building real-time or near real-time pipelines.
Exposure to advanced Databricks runtime configurations and tuning.
Certifications (Optional):
Databricks Certified Data Engineer Associate / Professional
Azure Data Engineer Associate