Job Description:
-
Design and implement robust, scalable data engineering solutions using Python and PySpark for data pipelines and ETL processes.
-
Architect well-structured Python projects: use appropriate directory structures (e.g., src/, tests/, config files) and adhere to clean code and clean architecture principles.
-
Write and maintain comprehensive unit and integration tests with pytest and related tooling (tox, pre-commit).
-
Apply sound software design principles (SOLID/OOP, modularization, maintainability, clear separation of concerns).
-
Use and enforce version control best practices (branching, PRs, code review) and continuous integration (CI/CD) for automated testing and deployment.
-
Develop, configure, and maintain virtual environments using tools like venv, conda, pyenv, or poetry; ensure dependency reproducibility and isolation.
-
Containerize applications and workflows with Docker or similar tools, including writing concise Dockerfiles and ensuring environment reproducibility.
-
Collaborate with data science and analytics teams to structure, test, and deploy ML pipelines with proper test coverage and code structure.
-
Write clear, maintainable documentation (README, inline docs, docstrings).
-
Contribute to MLOps/DevOps workflows: automating model training, deployment, and performance monitoring, and managing reproducible pipelines.