We are seeking a highly experienced, hands-on Senior Data Engineer to join our Data Engineering teams. You will play a key role in supplementing existing capacity, upgrading our data architecture, and ensuring the highest quality, performance, and cost-efficiency of our data platforms. The work focuses on critical deliverables for personal investment, personal wealth, and comprehensive data analytics, while preparing the platform for a larger strategic move in the future.
Key Responsibilities
Design, build, and maintain high-performance ETL/ELT data pipelines using Python and PySpark.
Apply expert-level coding skills to develop and manage data processing jobs, leveraging PySpark for distributed computing across large-scale datasets.
Take full ownership of the data workflow, including ingesting data from multiple sources and scrubbing and validating it to ensure the highest quality.
Write and optimize complex, performant SQL queries for data extraction, integrity checks, and performance tuning.
Contribute to platform modernization by exploring and increasing the adoption of AI/ML, including using tools like Copilot and Claude for acceleration and building models to fill data gaps or improve systems.
Collaborate with data architects by proposing ideas and asking great questions, taking ownership as the expert on data pipelines and systems.
Implement DevOps practices for the automated deployment and orchestration of Python applications and data pipelines (e.g., using Docker, Jenkins, Terraform).
Apply hands-on experience with SQL and complex performance tuning.
Required Technical Skills
Programming: Expert-level proficiency in Python, including libraries like Pandas and NumPy.
Design: Experience designing data pipelines for data coming from multiple sources.
Data Processing: Solid hands-on experience with PySpark for building scalable data workflows.
Data Querying: Expert-level knowledge of writing complex SQL queries (Oracle or Snowflake), with proven ability to tune performance on large datasets and complex database code.
Cloud Platform: Strong experience with AWS cloud services and associated data services, specifically:
AWS Glue (ETL)
S3
Lambda
Redshift
DynamoDB, Athena, ECS, EventBridge, OpenSearch, RDS
ETL & Data Management: Strong proficiency in ETL/ELT methodologies and tools, as well as Data Quality, Data Validation, and Anomaly Detection techniques.
Scripting: Working experience with scripting and automation using Unix and Python.
Desired Skills & Professional Attributes
Familiarity with AI/ML and Large Language Model (LLM) approaches to data analysis and validation.
Knowledge of data warehousing concepts and data modeling techniques.
Experience with DevOps, Continuous Integration, and Continuous Delivery (e.g., Jenkins, GitHub).
Experience with BI Reporting tools such as Power BI or Tableau.
Strong preference for candidates with prior experience in the investment data domain.
Ability to work independently through complex data challenges, with strong analytical and problem-solving skills.
Sr. AWS Data Engineer
Malvern, PA
Any visa
10 years of experience required...