The Role:
We are looking for a highly skilled Data Engineer with strong experience in Python and Databricks to join our growing data team. The ideal candidate will be responsible for designing, building, and optimizing scalable data pipelines, ensuring data quality, and enabling advanced analytics and business insights.
Responsibilities:
- Design, develop, and maintain scalable ETL/ELT pipelines using Python and Databricks (see the sketch after this list).
- Integrate data from multiple sources (structured and unstructured) into centralized platforms.
- Optimize and automate data workflows to improve performance and reliability.
- Implement and enforce best practices for data quality, governance, and security.
- Collaborate with data scientists, analysts, and business stakeholders to deliver actionable insights.
- Monitor, troubleshoot, and resolve data-related issues.
- Support migration and modernization initiatives leveraging cloud platforms (Azure/AWS/GCP).
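By way of illustration, here is a minimal sketch of the kind of ETL step this role involves, assuming a Databricks-style environment where a Spark session with Delta Lake support is available; the bucket paths, column names, and schema are hypothetical:

```python
# Minimal illustrative ETL step; paths, columns, and schema are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example-etl").getOrCreate()

# Extract: read raw events from cloud storage (path is a placeholder).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Transform: basic cleansing and a derived partition column.
cleaned = (
    raw.dropDuplicates(["event_id"])
       .filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: append to a Delta Lake table, partitioned by date.
(cleaned.write
        .format("delta")
        .mode("append")
        .partitionBy("event_date")
        .save("s3://example-bucket/curated/events/"))
```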
Requirements:
- Proven experience as a Data Engineer or in a similar role.
- Strong proficiency in Python for data manipulation, automation, and pipeline development.
- Hands-on experience with Databricks (data pipelines, Delta Lake, notebooks, ML integration).
- Solid knowledge of SQL and experience working with relational and NoSQL databases.
- Experience with cloud data platforms (Azure Data Lake, AWS S3, or GCP BigQuery).
- Familiarity with data modeling, data warehousing, and ETL concepts.
- Understanding of Agile/Scrum methodologies.
- Strong problem-solving and communication skills.
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
Nice to Have:
- Experience with Spark or other big data frameworks.
- Exposure to machine learning pipelines and AI-driven data projects.
- Knowledge of DevOps/DataOps practices (CI/CD, Git, containerization).
- Experience with BI/visualization tools (Power BI, Tableau, Looker).