Databricks Developer
Req number:
R6819
Employment type:
Full time
Worksite flexibility:
Hybrid
Who we are
CAI is a global technology services firm with over 8,500 associates worldwide and annual revenue of $1 billion. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right, whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.
Job Summary
We are looking for a motivated Databricks Developer ready to take us to the next level! If you understand ETL/ELT pipelines in Databricks using PySpark/Spark SQL and Delta Lake and are looking forward to your next career move, apply now!
Job Description
We are looking for a Databricks Developer. This position will be full-time and hybrid, based in Bangalore.
What You'll Do
- Design & develop scalable ETL/ELT pipelines in Databricks using PySpark/Spark SQL and Delta Lake (a minimal pipeline sketch follows this list).
- Build and maintain notebooks, jobs, workflows, and Delta Live Tables (DLT) for batch/streaming ingestion (see the DLT sketch after this list).
- Optimize Spark performance (partitioning, caching, broadcast joins, shuffle tuning) and manage costs on clusters.
- Implement data quality checks (e.g., expectations, unit tests), schema evolution, CDC, and the medallion architecture (bronze/silver/gold).
- Collaborate with BI/Analytics to deliver curated datasets and semantic layers; document lineage and data contracts.
- Automate deployments via CI/CD.
- Secure data with Unity Catalog/ACLs, row-level permissions, and compliance best practices (see the row filter sketch after this list).
- Monitor & troubleshoot jobs using the Databricks Jobs UI; drive incident root-cause analysis.
- Provide tier 3 support to in-group and out-of-group customers, and make safe changes to production systems (OS, database, and application).
- Apply expertise in database scripting in SQL or any other database.
- Stay up to date with the latest trends, tools, and techniques in data analytics, big data, and related technologies. Explore and evaluate emerging technologies and methodologies to drive innovation and improve data and analytics capabilities.
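For illustration, a minimal PySpark sketch of the bronze-to-silver pattern referenced above, with a basic quality filter and an explicit broadcast join; all table names, columns, and paths are hypothetical:

```python
# Hypothetical bronze -> silver Delta pipeline with a quality filter and
# an explicit broadcast join; names and paths are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: land raw files as-is in a Delta table.
raw = spark.read.format("json").load("/mnt/landing/orders/")
raw.write.format("delta").mode("append").saveAsTable("bronze.raw_orders")

# Silver: enforce basic expectations, then enrich with a small dimension.
orders = spark.table("bronze.raw_orders").where(
    F.col("order_id").isNotNull() & (F.col("amount") > 0)
)
customers = spark.table("silver.dim_customers")
# Broadcasting the small side avoids shuffling the large orders table.
silver = orders.join(F.broadcast(customers), "customer_id", "left")
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")
```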
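Similarly, a hedged Delta Live Tables sketch showing a streaming bronze table and an expectation that drops invalid rows (paths and names are placeholders; inside a DLT pipeline, `spark` is provided by the runtime):

```python
# Hypothetical DLT pipeline: streaming bronze ingest plus a silver table
# guarded by an expectation that drops rows with a NULL event_id.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw events ingested with Auto Loader")
def events_bronze():
    return (
        spark.readStream.format("cloudFiles")   # Auto Loader
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/events/")           # placeholder path
    )

@dlt.table(comment="Validated events")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")
def events_silver():
    return dlt.read_stream("events_bronze").withColumn(
        "ingested_at", F.current_timestamp()
    )
```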
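And a brief sketch of row-level security with a Unity Catalog row filter, expressed as Spark SQL; the group, schema, and table names are placeholders:

```python
# Hypothetical Unity Catalog row filter: members of 'admins' see every
# row; everyone else sees only region = 'US'.
spark.sql("""
    CREATE OR REPLACE FUNCTION silver.us_only(region STRING)
    RETURN IF(IS_ACCOUNT_GROUP_MEMBER('admins'), TRUE, region = 'US')
""")
spark.sql(
    "ALTER TABLE silver.orders SET ROW FILTER silver.us_only ON (region)"
)
```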
What You'll Need
Required:
- Strong PySpark/Spark SQL; solid understanding of distributed computing concepts (RDDs, DataFrames, partitions).
- Hands-on experience with Databricks clusters, Jobs, Repos, Workflows, Delta Lake, and Unity Catalog (or equivalent).
- Experience with data modeling for analytics.
- Cloud proficiency (Azure preferred: ADLS/ABFSS, Key Vault, Data Factory/Synapse; or AWS: S3/Glue/MSK; or GCP).
- CI/CD for data workflows.
- Strong SQL and one scripting language (Python preferred).
- Familiarity with data quality frameworks (assertions, unit/integration tests); a minimal test sketch follows this list.
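As a hedged illustration of the testing item above, a small pytest sketch against a local SparkSession; `dropDuplicates` stands in for a real transformation under test:

```python
# Hypothetical unit test for a dedup transformation, run locally.
import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark():
    return (
        SparkSession.builder.master("local[2]").appName("tests").getOrCreate()
    )

def test_one_row_per_order(spark):
    df = spark.createDataFrame(
        [(1, "a"), (1, "a"), (2, "b")], ["order_id", "payload"]
    )
    deduped = df.dropDuplicates(["order_id"])  # stand-in transformation
    assert deduped.count() == 2
```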
Good to have:
- Streaming knowledge (Structured Streaming, Kafka/Event Hub); see the streaming sketch after this list.
- Delta Live Tables (advanced), the Photon runtime, and SQL Warehouses.
- Knowledge of cost optimization (cluster sizing, autoscaling, Spot/low-priority nodes).
- Exposure to ML pipelines (MLflow) for feature engineering (not mandatory).
- Experience integrating with Power BI/Tableau; semantic models; performance tuning.
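As a hedged sketch of the streaming item above, a minimal Structured Streaming job reading from Kafka (or Event Hubs via its Kafka-compatible endpoint) into a Delta table; broker, topic, and paths are placeholders:

```python
# Hypothetical Kafka -> Delta streaming ingest; broker, topic, and
# checkpoint path are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
)

query = (
    events.selectExpr("CAST(value AS STRING) AS payload")
    .writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders")
    .trigger(availableNow=True)   # process available data, then stop
    .toTable("bronze.orders_stream")
)
```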
Physical Demands
- Sedentary work that involves sitting or remaining stationary most of the time, with an occasional need to move around the office to attend meetings, etc.
- Ability to perform repetitive tasks on a computer using a mouse, keyboard, and monitor.
Reasonable accommodation statement
If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to or (888).
Required Experience:
IC