We are looking for an experienced PySpark Developer to design, build, and support scalable data pipelines and big data solutions. The role involves working on large datasets using Apache Spark (PySpark) and covers data transformation, performance optimization, and production support in an enterprise environment.
Design, develop, and maintain data pipelines using PySpark and Spark SQL
Perform data ingestion, transformation, and processing from multiple source systems
Develop and optimize Spark DataFrame and SQL-based transformations
Work with Hadoop ecosystem components such as Hive, HDFS, and Impala
Ensure data quality, validation, and reconciliation across source and target systems
Troubleshoot and resolve performance and production issues
Strong hands-on experience in PySpark
Good experience with Apache Spark and Spark SQL
Strong Python programming skills