Job Title: Lead AWS PySpark Engineer
Location: Hyderabad
Experience: 9 Years
Job Overview
We are seeking a highly skilled Lead AWS PySpark Engineer to design, develop, and optimize large-scale data processing pipelines on AWS. The ideal candidate will have strong experience in PySpark, distributed data processing, and AWS data services, along with the ability to lead technical initiatives and mentor data engineers.
Requirements
Key Responsibilities
Design, build, and maintain scalable data pipelines using PySpark on AWS.
Lead the development of ETL/ELT workflows for processing large volumes of structured and unstructured data.
Architect and optimize data solutions using AWS services such as S3, Glue, EMR, Redshift, Lambda, and Athena.
Collaborate with data scientists, analysts, and product teams to deliver high-quality data solutions.
Implement data quality checks, monitoring, and performance optimization for big data pipelines.
Lead code reviews, enforce best practices, and mentor junior engineers.
Work closely with DevOps teams to implement CI/CD pipelines and automated deployments.
Ensure compliance with data governance and security best practices.
Required Skills
9 years of experience in Data Engineering / Big Data Development.
Strong expertise in PySpark and Apache Spark.
Hands-on experience with the AWS ecosystem (S3, Glue, EMR, Lambda, Redshift, Athena).
Proficiency in Python and SQL.
Experience with data pipeline orchestration tools (Airflow, Step Functions, etc.).
Strong knowledge of distributed computing and big data processing.
Experience with data modeling, performance tuning, and query optimization.
Familiarity with CI/CD tools, Git, and Agile development practices.
Benefits
Comprehensive Medical Coverage:
Health insurance of INR 7.0 lakhs for you and your family (up to 6 members), ensuring complete peace of mind.
Robust Protection Plans:
Group Personal Accident Insurance and Group Term Life Insurance to safeguard you and your loved ones.
Retirement Benefits:
PF and Gratuity provided as per standard government regulations.
Flexible Work Options:
Enjoy hybrid work arrangements and flexible working hours.
Generous Leave Policy:
21 days of annual leave in addition to 10 company-declared holidays.
Employee Well-being Spaces:
Access to a dedicated break-out area with round-the-clock refreshments for relaxation and rejuvenation.