Job description:
Senior Developer with 8 years of experience and advanced hands-on skill in distributed data processing using PySpark and Scala.
Experience with data processing and ETL transformation logic; strong SQL.
Prior experience in the financial domain and entity/identity resolution is preferred.
Design and develop scalable data pipelines using Apache Spark and PySpark, ensuring high performance and efficiency.
Optimize Spark jobs and clusters for performance, resource utilization, and cost efficiency.
Implement and manage data storage solutions on the Hadoop Distributed File System (HDFS) and cloud storage services such as Amazon S3.
Use Scala or Java to write and optimize core data processing logic, complex transformations, and highly performant backend services.
Ensure strict data quality, governance, and validation checks are integrated into all data pipelines to maintain the accuracy and reliability of the data ecosystem.
Deploy, manage, and scale data applications and services using AWS compute resources, including Amazon EC2 instances, AWS EMR clusters, and containerized environments via Amazon ECS.
Design and develop robust, high-performance RESTful APIs and microservices using Scala/Java to enable real-time data access and transactional services.
Skills
Mandatory Skills: ETL Concepts, Python, Scala