We are looking for a Junior Data Engineer to join us and contribute to the development of modern real-time data processing capabilities. You will help transition existing data and ML workflows from batch processing to scalable streaming solutions. The role involves hands-on engineering, close collaboration with Data Scientists, and operational responsibility for production data pipelines.
Technology Environment
- Modern real-time data streaming technologies used for ML model inference
- Distributed data processing frameworks supporting scalable, low-latency pipelines
- Containerized workloads orchestrated in cloud-native environments
- Monitoring and observability tools for ensuring reliability and performance of data pipelines
- Python-based ecosystem supporting ML model integration and lifecycle management
Key Responsibilities
- Transform batch inference workflows into streaming pipelines.
- Define streaming semantics to replace batch windows, including micro-batching, windowing, and state management.
- Design Kafka topic structures, partitioning strategies, and consumer group patterns for prediction workloads.
- Implement checkpointing, backpressure handling, and delivery-guarantee strategies (at-least-once / exactly-once).
- Package and version ML model artifacts for streaming jobs, supporting safe rollouts and rollbacks.
- Tune performance for throughput and latency, including batching strategies and resource allocation.
- Deploy and operate streaming jobs with monitoring and alerting (lag, throughput, error rates).
- Integrate streaming outputs into downstream ETL/BI systems.
- Collaborate with Data Scientists on CI/CD for streaming models and monitor model performance/drift.
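For candidates unfamiliar with the terminology, the micro-batching and keyed-state semantics mentioned in the responsibilities above can be sketched in plain Python. This is a minimal, framework-free illustration with made-up names; in practice these concepts are handled by Spark Structured Streaming or Kafka consumer applications:

```python
from collections import defaultdict
from typing import Iterable, Iterator

def micro_batches(events: Iterable[dict], batch_size: int) -> Iterator[list]:
    """Group an unbounded event stream into fixed-size micro-batches."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

def run_keyed_counts(events: Iterable[dict], batch_size: int = 3) -> dict:
    """Maintain per-key counts as state that survives across micro-batches."""
    state = defaultdict(int)  # keyed state, analogous to a streaming state store
    for batch in micro_batches(events, batch_size):
        for event in batch:
            state[event["key"]] += 1
    return dict(state)
```

A real pipeline would additionally persist the state store via checkpointing and assign keys to partitions, which is where the Kafka topic and consumer-group design mentioned above comes in.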
Team & Collaboration
- You will work in a distributed delivery model closely aligned with the central AI/BI team in Germany.
- Daily collaboration through MS Teams, Jira, and Confluence.
- Agile methodologies (Scrum/Kanban) in crossfunctional squads.
Qualifications:
- Practical experience with Kafka (producers/consumers, topic design, partitions, retention).
- Experience with Spark Structured Streaming or similar streaming frameworks.
- Familiarity with migrating batch inference to streaming architectures.
- Experience running containerized workloads in Kubernetes.
- Strong Python skills and understanding of common ML libraries.
- English and Polish at B1 level or higher.
Nice to have:
- Basic monitoring/logging experience (ELK stack, metrics) and performance tuning.
- Experience with Kafka Streams.
- Familiarity with feature stores or retraining orchestration.
Additional Information:
This position offers a hybrid work model. Office locations: Warszawa, Poznań, Lublin.
The position includes participation in on-call duty.
We hereby inform you that Inetum Polska sp. z o.o. has implemented an internal reporting (whistleblowing) procedure. The content of the procedure and the possibility to submit an internal report are available at:
Work: No
Employment Type: Full-time