AWS Data Engineer - Hadoop, Spark, Scala, AWS Data Lake, Netezza
Role and Responsibilities
Understand business requirements from product owners and convert them into technical scope and requirement documents
Design end-to-end data ingestion and transformation solutions using the Hadoop ecosystem (Spark, Spark Streaming, Hive, etc.)
Create technical design documentation and mentor team members on implementation
Develop scalable solutions using Spark on AWS Data Lake (a minimal sketch of such a job follows this list)
Build reusable frameworks for data engineering on AWS using services such as S3, EMR, and Glue
Coordinate with cross-functional teams (upstream/downstream) for production deployments
Provide post-production support and bug fixes as needed
Interpret and migrate existing Netezza/Hadoop features into the AWS Data Lake architecture
Assist QA/SIT teams with unit testing, functional testing, and migration activities
Work with stakeholders to define reusable design patterns for data onboarding and integration
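To make the Spark-on-AWS responsibility above concrete, here is a minimal PySpark batch sketch of the kind of ingestion-and-transformation job described: it reads raw CSV from S3, applies a simple typed transformation, and writes partitioned Parquet to a curated zone. The bucket names, paths, and column are hypothetical placeholders, not details from this role.

```python
# Minimal sketch of an S3-to-S3 batch job; buckets, paths, and schema
# are illustrative assumptions only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3-ingest-sketch").getOrCreate()

# Read raw CSV landed in the S3 raw zone (path is hypothetical).
raw = spark.read.option("header", "true").csv("s3://example-raw-bucket/orders/")

# Example transformation: cast the amount column and stamp a load date.
curated = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("load_date", F.current_date())
)

# Write partitioned Parquet to the curated zone of the data lake.
curated.write.mode("overwrite").partitionBy("load_date").parquet(
    "s3://example-curated-bucket/orders/"
)
```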
5 MUST-HAVE Skills & Experience
Strong hands-on experience with the Hadoop ecosystem, particularly Hive and Spark with Scala
Extensive experience in data engineering on AWS, including S3, EMR, Glue, Redshift, Lake Formation, and Python
Proficiency in PySpark and building batch workloads on Hadoop and AWS platforms
Experience with code versioning and deployment tools such as Bitbucket, Artifactory, and AWS CodePipeline
Understanding and implementation of data encryption techniques and secure data handling (an envelope-encryption sketch follows this list)
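The encryption bullet above can be illustrated with a hedged sketch of envelope encryption using AWS KMS and the cryptography library: KMS issues a data key, the payload is encrypted locally with AES-GCM, and only the encrypted copy of the key is stored. The key alias is an assumption, and production code would add error handling and key caching.

```python
# Hedged sketch of envelope encryption with AWS KMS; the key alias is a
# placeholder and error handling is omitted.
import os
import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")

# Ask KMS for a fresh 256-bit data key (the alias is illustrative).
resp = kms.generate_data_key(KeyId="alias/example-data-lake-key", KeySpec="AES_256")
plaintext_key, encrypted_key = resp["Plaintext"], resp["CiphertextBlob"]

# Encrypt the payload locally with AES-GCM using the plaintext data key.
nonce = os.urandom(12)
ciphertext = AESGCM(plaintext_key).encrypt(nonce, b"sensitive record", None)

# Persist ciphertext + nonce + encrypted_key; discard the plaintext key.
# Decryption later calls kms.decrypt(CiphertextBlob=encrypted_key) to recover it.
```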
5 NICE-TO-HAVE Skills & Experience
Experience handling terabyte/petabyte-scale data and processing millions of transactions per day
Building and orchestrating ETL pipelines using Apache Airflow (a minimal DAG sketch follows this list)
Knowledge of Spark Streaming or similar streaming technologies
Proficiency in Scala or Java and comfort with Linux-based environments
Familiarity with AWS services such as Secrets Manager, KMS, and Lambda, and with Pythonic pipeline design principles
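As a brief illustration of the Airflow item above, here is a minimal DAG sketch that schedules the kind of nightly Spark batch job listed in the must-haves. The DAG id, schedule, and spark-submit path are assumptions; the `schedule` argument is the Airflow 2.4+ spelling (older versions use `schedule_interval`).

```python
# Illustrative Airflow DAG orchestrating a nightly Spark batch job;
# the DAG id, schedule, and script path are assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="nightly_orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Submit the PySpark ingestion job sketched earlier (path is hypothetical).
    ingest = BashOperator(
        task_id="ingest_orders",
        bash_command="spark-submit /opt/jobs/ingest_orders.py",
    )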
Experience Required