Lead Python Data Engineer
We are seeking a passionate and highly skilled Lead Python Data Engineer to spearhead the development and maintenance of our cutting-edge data infrastructure. As a leader in our data team, you'll play a pivotal role in designing, building, and optimizing data pipelines that empower our organization with insightful and actionable data. If you thrive in a fast-paced environment, possess a strong collaborative spirit, and are eager to make a significant impact, we encourage you to apply.
Objectives of this role:
- Design, architect, and implement robust and scalable data pipelines using Python and related technologies (Airflow; PySpark and PyFlink are a plus).
- Champion best practices for data engineering, code quality, testing, and deployment.
- Mentor and guide a team of talented data engineers, fostering a collaborative and high-performing team culture.
- Collaborate closely with Data Scientists, Data Analysts, and business stakeholders to translate complex business requirements into efficient data solutions.
- Continuously research and implement new technologies and best practices to improve the efficiency and scalability of our data platform.
- Take ownership of the deployment and monitoring of data pipelines and related infrastructure on cloud platforms such as OpenShift, ECS, or Kubernetes.
Responsibilities:
- Lead the design and development of data pipelines for the ingestion, transformation, and loading of data from various sources (databases, APIs, streaming platforms) into our data warehouse/lake.
- Write optimized and maintainable SQL queries and leverage SQLAlchemy for efficient database interaction.
- Implement robust data quality checks and monitoring systems to ensure data integrity and accuracy.
- Develop comprehensive documentation and contribute to knowledge sharing within the team.
- Contribute to the design and implementation of data governance policies and procedures.
Required skills and qualifications:
- 10 years of hands-on experience in a Data Engineering role, with strong proficiency in Python (version 3.6 or later).
- Extensive experience working with relational databases and writing complex SQL queries.
- Proven expertise with SQLAlchemy or similar ORM libraries.
- Experience with workflow management tools such as Airflow (PySpark or PyFlink is a major plus).
- Solid understanding of data warehousing concepts and experience working with large datasets.
- Ability to guide and mentor junior developers, fostering a collaborative team environment.
- Strong communication skills, both written and verbal, with the ability to explain complex technical concepts to technical and non-technical audiences.
- Experience deploying and managing applications on cloud platforms like OpenShift, ECS, or Kubernetes.