Experience: 6-8 Years
Location: Pune
Employment Type: Full-Time
Job Summary:
We are looking for a skilled and experienced PySpark Data Engineer to join our growing data engineering team. The ideal candidate will have 6-8 years of experience in designing and implementing data pipelines using PySpark, AWS Glue, and Apache Airflow, with strong proficiency in SQL. You will be responsible for building scalable data processing solutions, optimizing data workflows, and collaborating with cross-functional teams to deliver high-quality data assets.
Key Responsibilities:
Design, develop, and maintain large-scale ETL pipelines using PySpark and AWS Glue.
Orchestrate and schedule data workflows using Apache Airflow.
Optimize data processing jobs for performance and cost-efficiency.
Work with large datasets from various sources, ensuring data quality and consistency.
Collaborate with Data Scientists, Analysts, and other Engineers to understand data requirements and deliver solutions.
Write efficient, reusable, and well-documented code following best practices.
Monitor data pipeline health and performance; resolve data-related issues proactively.
Participate in code reviews, architecture discussions, and performance tuning.
Requirements:
6-8 years of experience in data engineering roles.
Strong expertise in PySpark for distributed data processing.
Hands-on experience with AWS Glue and other AWS data services (S3, Athena, Lambda, etc.).
Experience with Apache Airflow for workflow orchestration.
Strong proficiency in SQL for data extraction, transformation, and analysis.
Familiarity with data modeling concepts and data lake/data warehouse architectures.
Experience with version control systems (e.g., Git) and CI/CD processes.
Ability to write clean, scalable, and production-grade code.
Benefits
Company standard benefits.
Required Skills:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
At least 5 years of experience in project management roles, preferably within software development projects.
Strong leadership and team management skills, with the ability to motivate and guide technical teams towards project goals.
In-depth knowledge of software development methodologies, including Agile, Scrum, or Kanban.
Excellent communication skills, with the ability to interact effectively with technical teams, stakeholders, and senior management.
Experience managing multiple projects simultaneously and prioritizing tasks based on business needs.
Strong problem-solving and decision-making abilities, with a focus on delivering solutions to complex technical challenges.
Familiarity with project management tools and software development lifecycle tools.
Certifications in project management (PMP, PMI-ACP) or Agile methodologies (Scrum Master, SAFe) are a plus.