We are looking for an experienced Senior Data Engineer with a strong foundation in Python, SQL, and Spark, and hands-on expertise in AWS Databricks. In this role you will build and maintain scalable data pipelines and architecture to support analytics, data science, and business intelligence initiatives. You'll work closely with cross-functional teams to drive data reliability, quality, and performance.
Responsibilities:
- Design, develop, and optimize scalable data pipelines using Databricks on AWS, leveraging services such as Glue, S3, Lambda, and EMR alongside Databricks notebooks, workflows, and jobs.
- Build data lakes in AWS Databricks.
- Build and maintain robust ETL/ELT workflows using Python and SQL to handle structured and semi-structured data.
- Develop distributed data processing solutions using Apache Spark or PySpark.
- Partner with data scientists and analysts to provide high-quality, accessible, and well-structured data.
- Ensure data quality, governance, security, and compliance across pipelines and data stores.
- Monitor, troubleshoot, and improve the performance of data systems and pipelines.
- Participate in code reviews and help establish engineering best practices.
- Mentor junior data engineers and support their technical development.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 5 years of hands-on experience in data engineering, with at least 2 years working with AWS Databricks.
- Strong programming skills in Python for data processing and automation.
- Advanced proficiency in SQL for querying and transforming large datasets.
- Deep experience with Apache Spark/PySpark in a distributed computing environment.
- Solid understanding of data modelling, warehousing, and performance optimization techniques.
- Proficiency with AWS services such as Glue, S3, Lambda, and EMR.
- Experience with version control (Git or AWS CodeCommit).
- Experience with a workflow orchestration tool such as Airflow or AWS Step Functions is a plus.
Remote Work:
No
Employment Type:
Full-time