Job description:
Senior Developer with 8 years of experience and advanced hands-on skill in distributed data processing using PySpark and Scala.
Experience with data processing and ETL transformation logic; strong SQL.
Prior experience in the financial domain and entity/identity resolution is preferred.
Design and develop scalable data pipelines using Apache Spark and PySpark, ensuring high performance and efficiency.
Optimize Spark jobs and clusters for performance, resource utilization, and cost efficiency.
Implement and manage data storage solutions on the Hadoop Distributed File System (HDFS) and cloud storage services such as Amazon S3.
Use Scala or Java to write and optimize core data processing logic, complex transformations, and highly performant backend services.
Ensure strict data quality, governance, and validation checks are integrated into all data pipelines to maintain the accuracy and reliability of the data ecosystem.
Deploy, manage, and scale data applications and services using AWS compute resources, including Amazon EC2 instances, AWS EMR clusters, and containerized environments via Amazon ECS.
Design and develop robust, high-performance RESTful APIs and microservices using Scala/Java to enable real-time data access and transactional services.
Skills
Mandatory Skills: ETL Concepts, Python, Scala