We are looking for a skilled Data Engineer with strong experience in Databricks, PySpark, SQL, and Python. The ideal candidate will be responsible for building scalable data pipelines, optimizing data processing workflows, and enabling data-driven decision-making across the organization.
Key Responsibilities
Design, develop, and maintain scalable data pipelines using PySpark and Databricks
Develop and optimize complex SQL queries for data transformation and analysis
Build and manage ETL/ELT workflows for structured and unstructured data
Work with large-scale distributed data processing systems
Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements
Ensure data quality, integrity, and consistency across pipelines
Implement performance tuning and optimization techniques in Databricks
Develop reusable and efficient Python-based data processing modules
Ingest data from multiple sources (APIs, databases, files, streaming, etc.)
Maintain proper documentation and follow data engineering best practices
Required Skills
Strong experience in Python and PySpark
Hands-on experience with the Databricks platform
Advanced knowledge of SQL
Experience in building and optimizing data pipelines and workflows
Good understanding of data modeling concepts (Star/Snowflake schemas)
Familiarity with big data technologies and distributed computing
Experience with data lakes / lakehouse architecture
Understanding of performance tuning and job optimization
Preferred Skills
Experience with cloud platforms (Azure / AWS / GCP)
Knowledge of Delta Lake
Exposure to CI/CD pipelines for data engineering
Experience with orchestration tools (Airflow, ADF, etc.)
Basic understanding of data governance and security
Soft Skills
Strong problem-solving and analytical thinking
Good communication and stakeholder management skills
Ability to work in a fast-paced, collaborative environment
Nice to Have
Experience working in Agile/Scrum environments
Exposure to real-time/streaming data processing
Education
Bachelor's or master's degree in Computer Science, Engineering, or a related field
Required Experience:
Senior IC