About:
Virtusa is a global IT services company offering digital transformation, engineering, and consulting, helping businesses with cloud, AI, and legacy system modernization across industries like finance, healthcare, and tech. Founded in Sri Lanka in 1996, it is now headquartered in Massachusetts, serving major clients worldwide through its tech expertise and strategic partnerships, such as with Google Cloud. The company focuses on product development, platform engineering, and making experiences better with technology, though employee reviews highlight varied experiences.
Role:
We are looking for a Python / PySpark Developer to build and maintain big data processing pipelines. The role involves designing and implementing ETL/ELT workflows, transforming large datasets, and enabling analytics and reporting solutions on distributed computing platforms.
Skills:
- Python and PySpark.
- Knowledge of distributed data processing frameworks (Apache Spark).
- Experience with SQL and relational/non-relational databases.
- Experience with big data platforms such as Hadoop or Databricks.
- Understanding of data pipelines, batch/stream processing, and ETL/ELT.
- Problem-solving and analytical skills.
Key Responsibilities:
- Develop and maintain ETL/ELT pipelines using Python and PySpark.
- Process large-scale structured and unstructured data from multiple sources.
- Write optimized PySpark jobs for batch and streaming data.
- Work with data engineers, analysts, and architects to understand data requirements.
- Ensure data quality, validation, and integrity in pipelines.
- Monitor and troubleshoot data workflows and Spark jobs for performance issues.
- Document data pipelines, workflows, and transformations for reference.
- Integrate pipelines with cloud storage and data warehouse platforms (AWS S3, Redshift, Azure Data Lake, etc.).