Data Engineer (Azure DE, PySpark)

Bengaluru - India

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1

Job Summary

Job Description:

Design and implement robust, scalable data engineering solutions using Python and PySpark for data pipelines and ETL processes.
Architect well-structured Python projects: use appropriate directory structures (e.g. src/, tests/, config files) and adhere to clean code and clean architecture principles.
Write and maintain comprehensive unit tests and integration tests with pytest (and related tooling such as tox and pre-commit).
Apply sound software design principles (SOLID/OOP, modularization, maintainability, clear separation of concerns).
Use and enforce version control best practices (branching, PRs, code review) and continuous integration and delivery (CI/CD) for automated testing and deployment.
Develop, configure, and maintain virtual environments using tools like venv, conda, pyenv, or poetry; ensure dependency reproducibility and isolation.
Containerize applications and workflows with Docker or similar, including writing concise Dockerfiles and ensuring environment reproducibility.
Collaborate with data science and analytics teams to structure, test, and deploy ML pipelines with proper test coverage and code structure.
Write clear, maintainable documentation (README, inline docs, docstrings).
Contribute to MLOps/DevOps workflows: automating model training, deployment, and performance monitoring, and managing reproducible pipelines.
