About the Role
We are seeking an experienced Data Engineer to design, build, and maintain scalable data pipelines and solutions on AWS. You will work closely with data scientists, analysts, and software engineers to ensure seamless data integration, transformation, and analytics.
Key Responsibilities
- Design and develop ETL/ELT pipelines using AWS services such as Glue, Lambda, Step Functions, and EMR.
- Build and optimize data lakes and data warehouses on S3, Redshift, Athena, and Lake Formation.
- Work with streaming data using Kinesis, Kafka, or MSK for real-time processing.
- Implement data security, governance, and compliance best practices (IAM, KMS encryption, GDPR, HIPAA).
- Optimize query performance and cost in Redshift, Athena, and Glue.
- Automate data workflows with Airflow, Step Functions, or MWAA.
- Work with structured and unstructured data across various formats (Parquet, ORC, JSON, Avro, CSV).
- Collaborate with DevOps teams to deploy data solutions using Terraform, CloudFormation, or CDK.
- Implement monitoring and logging using CloudWatch, AWS X-Ray, or third-party tools.
Required Skills & Qualifications
- 2 years of experience in data engineering with AWS.
- Strong expertise in Python, SQL, or Scala.
- Hands-on experience with AWS Glue, Redshift, S3, Lambda, Athena, EMR, and Kinesis.
- Experience in data modeling, warehouse design, and schema optimization.
- Proficiency in Spark (PySpark), Pandas, and other data processing frameworks.
- Familiarity with CI/CD pipelines for data workflows.
- Strong understanding of AWS security best practices (IAM, VPC, encryption).
- (Optional, but a plus) Experience with Databricks on AWS or Snowflake.