Senior Data Engineer
Job Summary
Leads the implementation of ingestion pipelines, change data capture (CDC), and ELT transformation layers. Drives design decisions, mentoring, and quality assurance for data pipelines.
- Expert in PySpark and AWS Glue
- Strong SQL experience with Redshift or an equivalent massively parallel processing (MPP) database
- Data Lake integration with S3
- Schema evolution handling for semi-structured data
- Airflow DAG development for orchestration
- Experience implementing data quality (DQ) checks and CDC
- Familiarity with EMR and Spark performance tuning
- Experience with Redshift Spectrum
- AWS (Glue, S3, Redshift, Lambda)
- Apache Airflow
- Git, GitLab CI