Leads the implementation of ingestion pipelines, CDC, and ELT transformation layers. Drives design decisions, mentoring, and quality assurance for data pipelines.
- Expert in PySpark and AWS Glue
- Strong SQL experience with Redshift or an equivalent massively parallel processing (MPP) database
- Data Lake integration with S3
- Schema evolution handling for semi-structured data
- Airflow DAG development for orchestration
- Experience implementing data quality (DQ) checks and change data capture (CDC)
- Familiarity with EMR and Spark performance tuning
- Experience with Redshift Spectrum
- AWS (Glue, S3, Redshift, Lambda)
- Apache Airflow
- Git, GitLab CI