Position: Data Engineer (Python, PySpark & AWS)
Location: McLean, VA
Duration: Long-term contract
Role Summary:
We are seeking an experienced Data Engineer with strong expertise in Python, PySpark, and AWS cloud data services. The ideal candidate will design, build, and optimize scalable data pipelines, ensuring high-quality data availability for analytics, reporting, and business operations. This role requires hands-on development, strong problem-solving skills, and experience working with large-scale distributed systems and data platforms.
Key Responsibilities
- Design, develop, and maintain ETL/ELT pipelines using Python and PySpark
- Build and optimize data ingestion, transformation, and processing frameworks
- Work with AWS cloud services including S3, Glue, EMR, Lambda, Redshift, Athena, and DynamoDB
- Partner with data architects, analysts, and BI teams to deliver high-quality data solutions
- Perform data profiling, quality checks, and validation for accuracy and consistency
- Automate data workflows and improve data pipeline performance
- Implement best practices for security, monitoring, version control, and CI/CD
- Troubleshoot complex data and pipeline issues in a distributed environment
- Document solutions, data dictionaries, lineage, and technical workflows
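For context, the data profiling and validation responsibility above can be sketched in plain Python (a minimal illustration; the record structure and field names "id" and "amount" are hypothetical, not the team's actual schema or framework):

```python
# Minimal data-quality check: split incoming records into valid and
# rejected rows based on simple accuracy/consistency rules.
# Field names ("id", "amount") are hypothetical, for illustration only.

def profile_and_validate(records):
    """Return (valid, rejected) lists of records after quality checks."""
    valid, rejected = [], []
    for rec in records:
        # Rule 1: required key must be present and non-null
        if rec.get("id") is None:
            rejected.append(rec)
            continue
        # Rule 2: amount must be numeric and non-negative
        amount = rec.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            rejected.append(rec)
            continue
        valid.append(rec)
    return valid, rejected

rows = [
    {"id": 1, "amount": 10.5},
    {"id": None, "amount": 3.0},   # fails Rule 1
    {"id": 2, "amount": -4},       # fails Rule 2
]
valid, rejected = profile_and_validate(rows)
print(len(valid), len(rejected))  # → 1 2
```

In a production pipeline the same split would typically be expressed as PySpark DataFrame filters, with rejected rows routed to a quarantine location for review.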
Required Skills & Qualifications
- 12 years of hands-on data engineering experience
- Strong programming skills in Python, including data structures and OOP
- Deep expertise with PySpark for distributed data processing
- Proficiency with the AWS cloud data ecosystem (S3, Glue, EMR, Lambda, Redshift, Athena, Step Functions, IAM)
- Strong SQL skills, including query optimization techniques
- Hands-on experience with ETL/ELT pipeline development
- Experience with Docker, Git, and CI/CD tools
- Understanding of data modeling and schema design (Star/Snowflake)
- Experience working in Agile/Scrum environments