Role Name - Lead PySpark Engineer
Role Description -
10 years of experience in big data and distributed computing.
Very strong hands-on experience with PySpark, Apache Spark, and Python.
Strong hands-on experience with SQL and NoSQL databases (DB2, PostgreSQL, Snowflake, etc.).
Proficiency in data modeling and ETL workflows.
Proficiency with workflow schedulers such as Airflow.
Hands-on experience with AWS cloud-based data platforms.
Experience with DevOps, CI/CD pipelines, and containerization (Docker, Kubernetes) is a plus.
Strong problem-solving skills and the ability to lead a team.
Lead the design, development, and deployment of PySpark-based big data solutions.
Architect and optimize ETL pipelines for structured and unstructured data.
Collaborate with client data engineers, data scientists, and business teams to understand requirements and deliver scalable solutions.
Optimize Spark performance through partitioning, caching, and tuning.
Implement data engineering best practices (CI/CD, version control, unit testing).
Work with cloud platforms such as AWS.
Ensure data security, governance, and compliance.
Mentor junior developers and review code for best practices and efficiency.