Job Title: Data Engineer (AWS, Python, PySpark)
Location: Richmond, VA or Dallas, TX (Hybrid)
Duration: Long Term Contract
Rate: $50/hr on C2C
Interview Mode: Face-to-Face (Client Round)
Client Domain: Confidential
Job Description
We are seeking an experienced Data Engineer with strong expertise in AWS, Python, and PySpark to join our client's data engineering team. The ideal candidate will be responsible for building scalable data pipelines, optimizing ETL workflows, and managing data processing across distributed systems. This role requires hands-on experience in big data technologies, data modeling, and cloud-based data solutions in a hybrid work environment.
Key Responsibilities
Design, develop, and maintain scalable ETL pipelines using Python and PySpark on AWS.
Ingest, transform, and process large-scale structured and unstructured data from diverse sources.
Work with AWS services such as S3, Glue, EMR, Lambda, Redshift, and Athena.
Optimize data storage, retrieval, and query performance for analytics and reporting.
Implement and maintain data quality, validation, and governance processes.
Collaborate with data scientists, analysts, and application teams to deliver end-to-end data solutions.
Automate workflows, monitor data pipelines, and resolve data-related issues proactively.
Participate in Agile ceremonies, including sprint planning, reviews, and stand-ups.
Required Skills & Qualifications
6 years of experience as a Data Engineer.
Strong programming skills in Python and PySpark.
Proven experience with AWS data services (Glue, S3, Lambda, EMR, Redshift).
Hands-on experience with ETL design, data pipelines, and data lake architectures.
Familiarity with SQL and relational databases (PostgreSQL, MySQL, or similar).
Experience with version control tools like Git and CI/CD pipelines.
Strong analytical and problem-solving skills.
Excellent communication and teamwork abilities.