Data Engineer

Mumbai - India

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1

Job Summary

We are looking for a talented and motivated Data Engineer with strong experience in PySpark and Python to design, build, and maintain scalable data pipelines and infrastructure. The successful candidate will support the delivery of data-driven insights by transforming raw data into clean, curated datasets for analytics and machine learning applications. Java experience is a plus and will be useful in hybrid environments.

Key Responsibilities:

Develop and optimize robust, scalable data pipelines using PySpark and Python
Clean, transform, and enrich large-scale datasets from structured and unstructured sources
Implement data ingestion, ETL/ELT workflows, and integration strategies across cloud and on-prem platforms
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements
Ensure data quality, integrity, and lineage throughout the data lifecycle
Participate in performance tuning, troubleshooting, and production support
Contribute to best practices in data engineering, including code versioning, testing, and CI/CD

Qualifications:

Required Qualifications:

Bachelor's degree in Computer Science, Data Engineering, or a related field
3 years of experience in data engineering, with a focus on PySpark and Python
Strong hands-on experience with distributed data processing frameworks (e.g., Apache Spark)
Solid understanding of SQL, data modeling, and relational databases
Experience working with cloud platforms (e.g., AWS, Azure, GCP)
Familiarity with workflow orchestration tools (e.g., Airflow, Azure Data Factory)

Preferred Qualifications:

Java experience for supporting hybrid data platforms and legacy integrations
Exposure to data lakes, delta lakes, and modern data architectures
Knowledge of containerization (Docker), Kubernetes, and CI/CD pipelines
Familiarity with data governance, security, and compliance frameworks

Additional Information:

We believe in supporting our team professionally and personally.

OUR COMMITMENT TO DIVERSITY

At Sia, we believe in fostering a diverse, equitable, and inclusive culture where our employees and partners are valued and thrive in a sense of belonging. We are committed to recruiting and developing a diverse network of employees and investing in their growth by providing unique opportunities for professional and cultural immersion. Our commitment toward inclusion motivates dynamic collaboration with our clients, building trust by creating an inclusive environment of curiosity and learning that effects lasting impact.

Please visit our website for more information.

Sia is an equal opportunity employer. All aspects of employment, including hiring, promotion, remuneration, or discipline, are based solely on performance, competence, conduct, or business needs.

Remote Work:

Yes

Employment Type:

Full-time

Key Skills

Apache Hive
S3
Hadoop
Redshift
Spark
AWS
Apache Pig
NoSQL
Big Data
Data Warehouse
Kafka
Scala

About Company

Sia

Sia is a next-generation international management consulting group. Founded in 1999, we were born in the digital era. Today, our strategy and management services are augmented by data science, enriched by creativity, and guided by responsibility. ...
