Job Summary
We are looking for a skilled AWS Data Engineer with strong expertise in building scalable data pipelines using AWS services such as Glue, Redshift, and Lambda. The ideal candidate should have deep knowledge of SQL, PySpark, and ETL processes, with a focus on performance optimization and large-scale data processing.
Key Responsibilities
- Design, develop, and maintain scalable ETL pipelines using AWS services (a minimal Glue job sketch follows this list)
- Build and optimize data workflows using AWS Glue, Lambda, and PySpark
- Develop and manage data warehousing solutions in Amazon Redshift
- Write complex, optimized SQL queries for data extraction, transformation, and analysis
- Implement data processing frameworks using Apache Spark (PySpark)
- Optimize data pipelines for performance, scalability, and cost efficiency
- Implement and maintain data models and data warehouse schemas
- Work with large datasets and ensure data quality, consistency, and integrity
- Implement incremental loads, Change Data Capture (CDC), and Slowly Changing Dimensions (SCD)
- Monitor, troubleshoot, and enhance existing ETL jobs and workflows
- Collaborate with data analysts, data scientists, and business stakeholders
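
To make the pipeline responsibilities above concrete, here is a minimal sketch of an AWS Glue job written in PySpark, as referenced in the first bullet. The catalog database and table, the column names, and the S3 path are illustrative assumptions only, not details of this role's actual environment.

    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glueContext = GlueContext(SparkContext())
    spark = glueContext.spark_session
    job = Job(glueContext)
    job.init(args["JOB_NAME"], args)

    # Read a source table from the Glue Data Catalog (names are placeholders)
    orders = glueContext.create_dynamic_frame.from_catalog(
        database="sales_db", table_name="raw_orders"
    ).toDF()

    # Basic cleanup: drop rows without a key and normalize a column name
    clean = orders.dropna(subset=["order_id"]).withColumnRenamed("amt", "amount")

    # Write partitioned Parquet to a curated S3 zone (path is a placeholder)
    clean.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-bucket/curated/orders/"
    )

    job.commit()

From here, the curated data could be loaded into Redshift (for example via COPY or the Glue Redshift connector) for the warehousing work described above.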
Required Skills
- Strong hands-on experience with AWS Glue, Amazon Redshift, and AWS Lambda
- Expertise in SQL: complex joins, aggregations, window functions, and query performance tuning (illustrated in the first sketch after this list)
- Strong experience with PySpark / Apache Spark, including a solid understanding of Spark architecture
- Performance optimization (partitioning, caching, join strategies, etc.)
- Solid understanding of ETL concepts: SCD Type 1 and Type 2, delta loads, and Change Data Capture (CDC); see the SCD Type 2 sketch after this list
- Experience in data warehousing concepts (Star/Snowflake schema)
- Strong problem-solving and analytical skills
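
As a concrete illustration of the SQL and Spark expectations above, here is a short PySpark sketch that combines a window function with two common tuning techniques, a broadcast join and caching. All paths, tables, and columns are hypothetical.

    from pyspark.sql import SparkSession, Window
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("skills-sketch").getOrCreate()

    orders = spark.read.parquet("s3://example-bucket/curated/orders/")    # placeholder
    customers = spark.read.parquet("s3://example-bucket/dim/customers/")  # placeholder

    # Window function: rank each customer's orders by amount, keep the top 3
    w = Window.partitionBy("customer_id").orderBy(F.col("amount").desc())
    top_orders = orders.withColumn("rn", F.row_number().over(w)).filter("rn <= 3")

    # Tuning: broadcast the small dimension table to avoid a shuffle, and
    # cache the joined result because several aggregations reuse it
    enriched = top_orders.join(F.broadcast(customers), "customer_id").cache()

    summary = enriched.groupBy("region").agg(
        F.sum("amount").alias("total_amount"),
        F.countDistinct("customer_id").alias("distinct_customers"),
    )
    summary.show()

The same ranking could be written in Redshift SQL with ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC); the windowing concept is identical.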
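
Likewise, one common way to express an SCD Type 2 increment in PySpark is sketched below. This is a simplified sketch under stated assumptions: real jobs usually track more attributes and use a transactional table format (for example, Delta Lake or Iceberg MERGE) for atomic updates, and every table path and column name here is a placeholder.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

    current = spark.read.parquet("s3://example-bucket/dim/customers_scd/")  # placeholder
    updates = spark.read.parquet("s3://example-bucket/staging/customers/")  # placeholder CDC batch

    c = current.filter(F.col("is_current") == True).alias("c")
    u = updates.alias("u")

    # A row has "changed" when a tracked attribute differs between versions
    changed = c.join(u, F.col("c.customer_id") == F.col("u.customer_id")) \
               .filter(F.col("c.address") != F.col("u.address"))

    # Type 2: expire the old version instead of overwriting it
    expired = changed.select("c.*") \
        .withColumn("is_current", F.lit(False)) \
        .withColumn("end_date", F.current_date())

    # Insert the new version with an open-ended validity window
    new_rows = changed.select("u.*") \
        .withColumn("is_current", F.lit(True)) \
        .withColumn("start_date", F.current_date()) \
        .withColumn("end_date", F.lit(None).cast("date"))

    # A full job would union expired and new_rows with the untouched rows
    # and rewrite (or MERGE into) the dimension table atomically.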