Data Engineer - Databricks

R3 Consultant

Job Location: Bengaluru, India
Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1

Job Summary

Role: Databricks PySpark Developer
Experience: 5 years
Location: Bangalore (onsite, 5 days/week); no relocation candidates
Notice period: immediate joiners or candidates serving notice period

Role Overview:
We are looking for a highly skilled Databricks PySpark Developer to join our data platform implementation. In this role, you will be responsible for designing, developing, and optimizing scalable ETL pipelines and data workflows using Databricks and Apache Spark. You will work closely with data engineers, data scientists, and BI teams to support advanced analytics and reporting requirements.

Key Responsibilities:
1. ETL Development & Data Engineering
Design, develop, and maintain scalable ETL processes using Databricks PySpark.
Extract, transform, and load data from heterogeneous sources into Data Lake and Data Warehouse environments.
Optimize ETL workflows for performance, scalability, and cost efficiency using Spark SQL and PySpark.
Implement robust error handling, logging, and monitoring mechanisms for ETL jobs.
Design and implement data solutions following the Medallion Architecture (Bronze, Silver, Gold layers).
Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption (a minimal PySpark sketch follows this list).
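For illustration only, a Bronze-to-Silver step of this kind could look as follows in PySpark. This sketch is not part of the posting; the table names (bronze.orders, silver.orders) and columns (order_id, order_ts) are hypothetical placeholders.

```python
# Minimal Bronze -> Silver sketch (Medallion Architecture) in PySpark.
# Table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

# Bronze: raw data landed as-is from the source system.
bronze_df = spark.read.table("bronze.orders")

# Silver: cleanse, validate, and enrich for analytics consumption.
silver_df = (
    bronze_df
    .dropDuplicates(["order_id"])                         # de-duplicate on the business key
    .filter(F.col("order_id").isNotNull())                # basic validation
    .withColumn("order_ts", F.to_timestamp("order_ts"))   # normalize types
    .withColumn("ingest_date", F.current_date())          # enrichment for partitioning
)

# Persist as a Delta table so the Gold layer can build on validated data.
(silver_df.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("ingest_date")
    .saveAsTable("silver.orders"))
```

Writing each layer as its own Delta table keeps the raw Bronze data replayable while the Silver layer stays validated and query-ready.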


2. Data Pipeline Management
Hands-on experience in building and managing advanced data pipelines using Databricks Workflows.
Develop and maintain reliable reusable and scalable pipelines ensuring data quality and integrity.
Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.
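The posting does not specify how these pipelines are defined; as one illustration, a two-task Databricks Workflows job can be created programmatically with the Databricks SDK for Python. The job name, notebook paths, and cluster id below are hypothetical.

```python
# Sketch: a two-task Databricks Workflows job via the Databricks SDK for Python.
# The job name, notebook paths, and cluster id are hypothetical placeholders.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()  # picks up credentials from the environment or .databrickscfg

job = w.jobs.create(
    name="orders-etl-pipeline",
    tasks=[
        jobs.Task(
            task_key="bronze_ingest",
            notebook_task=jobs.NotebookTask(notebook_path="/pipelines/bronze_ingest"),
            existing_cluster_id="1234-567890-abcde123",
        ),
        jobs.Task(
            task_key="silver_transform",
            # Runs only after bronze_ingest succeeds.
            depends_on=[jobs.TaskDependency(task_key="bronze_ingest")],
            notebook_task=jobs.NotebookTask(notebook_path="/pipelines/silver_transform"),
            existing_cluster_id="1234-567890-abcde123",
        ),
    ],
)
print(f"Created job {job.job_id}")
```

Expressing task dependencies in the job definition, rather than chaining notebooks by hand, is what makes such pipelines reusable and restartable from a failed task.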


3. Data Analysis & Query Optimization
Write, review, and optimize complex SQL queries for data transformation, aggregation, and analysis.
Perform query tuning and performance optimization on large-scale datasets within Databricks (see the tuning sketch below).
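As a hedged example of what such tuning can involve, the PySpark sketch below broadcasts a small dimension table to avoid a shuffle-heavy join and then inspects the physical plan; the table and column names are hypothetical.

```python
# Sketch: two common Spark query-tuning moves on Databricks.
# Table and column names (silver.orders, silver.customers, amount) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

orders = spark.read.table("silver.orders")
customers = spark.read.table("silver.customers")

# 1) Broadcast the small dimension table so Spark uses a broadcast hash join
#    instead of a shuffle-heavy sort-merge join.
joined = orders.join(broadcast(customers), "customer_id")

# 2) Inspect the physical plan to confirm the join strategy before running at scale.
totals = joined.groupBy("customer_id").agg(F.sum("amount").alias("total_amount"))
totals.explain()  # look for BroadcastHashJoin in the output
```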


4. Project Coordination & Continuous Improvement
Participate in project planning, estimation, and delivery activities.
Stay up to date with the latest features in Databricks, Spark, and cloud data platforms, and recommend best practices.
Document ETL processes, data lineage, metadata, and workflows to support data governance and compliance.
Mentor junior developers and contribute to team knowledge sharing where required.

Required Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field.
5 years of experience in ETL/data engineering roles with a strong focus on Databricks and PySpark.
Strong proficiency in Python, with hands-on experience developing and debugging PySpark applications.
In-depth understanding of Apache Spark architecture, including RDDs, DataFrames, and Spark SQL.
Expertise in SQL development and optimization for large-scale data processing.
Proven experience with data warehousing concepts and ETL frameworks.
Strong problem-solving and troubleshooting skills.
Excellent communication and collaboration skills.

Preferred Qualifications:
Experience working on cloud platforms, preferably AWS.
Hands-on experience with tools such as Databricks, Snowflake, Tableau, or similar data platforms.
Strong understanding of data governance, data quality, and best practices in data engineering.
Relevant certifications in Databricks, PySpark, Spark SQL, or cloud technologies.


Required Skills:

Data Engineer, Databricks, PySpark, Apache Spark, Data Lake, Data Warehouse, ETL, Data Pipeline, Spark SQL, SQL Queries, Python, AWS, Tableau

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala