We are an IT Solutions Integrator/Consulting Firm helping our clients hire the right professional for an exciting long-term project. Here are a few details.
We are seeking a highly skilled and motivated Python Developer with strong expertise in PySpark and AWS to join our data engineering team. The ideal candidate will be responsible for building scalable data pipelines, transforming large volumes of data, and deploying data solutions in cloud environments. You will collaborate with cross-functional teams to design, develop, and implement high-performance, reliable, and scalable data processing systems.
Design, develop, and maintain efficient, reusable, and reliable Python code.
Develop scalable data processing pipelines using PySpark for structured and semi-structured data (a minimal example of this kind of pipeline appears after this list).
Build and automate data workflows and ETL pipelines using AWS services such as S3, Glue, Lambda, EMR, and Step Functions.
Optimize data processing for performance, scalability, and reliability.
Participate in architecture design discussions and contribute to technical decision-making.
Integrate with data sources such as RDBMS, NoSQL, and REST APIs.
Implement data quality checks, monitoring, and logging for production pipelines.
Work closely with data analysts, architects, and DevOps teams to ensure seamless data flow and integration.
Perform unit testing, debugging, and performance tuning of code.
Maintain documentation for all developed components and processes.
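To make the pipeline work described above concrete, here is a minimal, illustrative PySpark batch job of the kind this role involves: it reads semi-structured JSON from S3, applies simple data quality filters, aggregates with the DataFrame API, and writes partitioned Parquet back to S3. The bucket names, paths, and column names are assumptions for illustration only, not details from this posting.

```python
from pyspark.sql import SparkSession, functions as F

# Minimal illustrative batch job; bucket names, paths, and columns are assumptions.
spark = SparkSession.builder.appName("orders_etl_sketch").getOrCreate()

# Read semi-structured JSON events from S3.
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Simple data quality checks: require a key and a positive amount.
clean = (
    raw.filter(F.col("order_id").isNotNull())
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("order_ts"))
)

# Aggregate with the DataFrame API (equivalent Spark SQL could be used instead).
daily_totals = (
    clean.groupBy("order_date")
         .agg(F.count("order_id").alias("order_count"),
              F.sum("amount").alias("total_amount"))
)

# Write partitioned Parquet back to S3 for downstream consumers such as Athena.
(daily_totals.write
             .mode("overwrite")
             .partitionBy("order_date")
             .parquet("s3://example-bucket/curated/daily_order_totals/"))

spark.stop()
```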
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
4 years of experience in Python programming for data engineering or backend development.
Strong hands-on experience with PySpark (RDD and DataFrame APIs, Spark SQL, performance tuning).
Proficient in using AWS services such as S3, Glue, Lambda, EMR, Athena, and CloudWatch (a brief orchestration sketch follows this list).
Good understanding of distributed computing and parallel data processing.
Experience working with large-scale datasets and batch/streaming data pipelines.
Familiarity with SQL and data modeling concepts.
Knowledge of CI/CD tools and source control (e.g., Git, Jenkins).
Solid understanding of software engineering best practices and Agile methodologies.
AWS certification (e.g., AWS Certified Developer or Data Analytics Specialty).
Experience with containerization (Docker) and orchestration tools (Kubernetes).
Familiarity with data lake and data warehouse concepts (e.g., Redshift, Snowflake).
Exposure to Apache Airflow or other workflow orchestration tools.
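As a brief illustration of the S3 / Lambda / Glue workflow automation referenced above, the sketch below shows a minimal Lambda handler that starts a Glue job run whenever a new object lands in S3. The Glue job name, bucket layout, and argument names are hypothetical, chosen only to show the pattern.

```python
import boto3

glue = boto3.client("glue")

def handler(event, context):
    """Hypothetical Lambda entry point: start a Glue job for each new S3 object.

    The job name, bucket layout, and argument names are illustrative assumptions,
    not details taken from this posting.
    """
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Kick off the (hypothetical) Glue ETL job, passing the new object as input.
        response = glue.start_job_run(
            JobName="orders-etl-job",
            Arguments={"--input_path": f"s3://{bucket}/{key}"},
        )
        print(f"Started Glue run {response['JobRunId']} for s3://{bucket}/{key}")
```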
Education: B.E/
Full Time