Data Engineering & Pipeline Development
Design, build, and maintain scalable and reliable data pipelines (batch and streaming).
Ingest and integrate data from multiple sources (SQL/NoSQL databases, APIs, files, cloud services).
Develop and maintain efficient ETL/ELT processes and data workflows (a minimal example is sketched after this list).
Ensure data quality, integrity, and availability across the data lifecycle.
Optimize data storage and processing for performance and cost efficiency.
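By way of illustration, a minimal sketch of the kind of batch ETL step these responsibilities involve; the orders.csv source, the column names, and the local SQLite target (standing in for a real warehouse) are hypothetical:

```python
# Minimal batch ETL sketch; file, columns, and target table are illustrative.
import sqlite3

import pandas as pd

def run_etl(source_path: str = "orders.csv", db_path: str = "warehouse.db") -> None:
    # Extract: read one raw batch (stand-in for a database, API, or cloud source).
    raw = pd.read_csv(source_path)

    # Transform: enforce basic quality rules before anything reaches the warehouse.
    clean = (
        raw.dropna(subset=["order_id"])            # reject rows missing the key
           .drop_duplicates(subset=["order_id"])   # enforce key uniqueness
           .assign(order_date=lambda df: pd.to_datetime(df["order_date"]))
    )

    # Load: append the curated batch into the target table.
    with sqlite3.connect(db_path) as conn:
        clean.to_sql("orders", conn, if_exists="append", index=False)

if __name__ == "__main__":
    run_etl()
```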
Data Platform & Architecture
Design and maintain modern data architectures (Data Lake, Data Warehouse, Lakehouse).
Implement scalable data models to support analytics and operational use cases.
Manage orchestration, scheduling, and monitoring of data pipelines (see the Airflow sketch after this list).
Maintain and improve cloud-based data infrastructure.
Apply data governance practices, version control, and technical documentation standards.
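For orientation, a minimal Apache Airflow sketch of the orchestration and scheduling responsibilities above; the DAG id, schedule, and task bodies are hypothetical placeholders:

```python
# Minimal Airflow DAG sketch; dag_id, schedule, and task logic are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract() -> None:
    ...  # pull a batch from a source system

def transform_load() -> None:
    ...  # clean the batch and load it into the warehouse

with DAG(
    dag_id="daily_sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="transform_load", python_callable=transform_load)

    extract_task >> load_task  # extract must finish before transform/load starts
```

Airflow's scheduler then handles retries, backfills, and run history, which is where the monitoring responsibility picks up.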
AI/ML & Data Infrastructure Support
Build and maintain data pipelines that support AI and Machine Learning use cases.
Prepare curated datasets and feature-ready data for Data Science teams.
Implement ingestion and processing pipelines for LLM-based applications.
Manage embedding pipelines and integrations with vector databases (a toy example follows this list).
Support RAG architectures from a data engineering and infrastructure perspective.
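To make the embedding and retrieval responsibilities concrete, a toy sketch: the embed() stub stands in for a real embedding model, and the NumPy array stands in for a real vector database.

```python
# Toy embedding-and-retrieval sketch; embed() is a placeholder, not a real model.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Deterministic pseudo-embedding (consistent within one process run);
    # a real pipeline would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)  # unit-normalize for cosine similarity

documents = ["refund policy", "shipping times", "warranty terms"]
index = np.stack([embed(d) for d in documents])  # one row per document

def retrieve(query: str, k: int = 2) -> list[str]:
    # On unit vectors, cosine similarity reduces to a dot product.
    scores = index @ embed(query)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

print(retrieve("how long does delivery take?"))
```

In a RAG setup, the retrieved documents would then be injected into the LLM prompt as context.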
Monitoring, Reliability & Performance
Implement monitoring, alerting, and observability for data pipelines and workflows (a minimal quality gate is sketched after this list).
Detect and resolve data quality issues, pipeline failures, and performance bottlenecks.
Optimize queries, data models, and processing jobs to improve scalability and reliability.
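As one example of such a quality gate, a minimal sketch; the column names and thresholds are illustrative:

```python
# Minimal data-quality gate sketch; column names and thresholds are illustrative.
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline.checks")

def check_batch(df: pd.DataFrame, max_null_ratio: float = 0.01) -> bool:
    """Return True only if the batch passes basic quality gates; log failures for alerting."""
    if df.empty:
        log.error("empty batch: upstream extract may have failed")
        return False
    null_ratio = df["customer_id"].isna().mean()
    if null_ratio > max_null_ratio:
        log.error("customer_id null ratio %.3f exceeds %.3f", null_ratio, max_null_ratio)
        return False
    if df.duplicated(subset=["order_id"]).any():
        log.error("duplicate order_id values detected in batch")
        return False
    log.info("batch of %d rows passed all checks", len(df))
    return True
```

A pipeline would call this between extract and load, routing failures to the alerting channel instead of the warehouse.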
Qualifications:
Requirements
Strong experience with Python, focused on data processing and pipeline development.
Advanced SQL skills and solid understanding of data modeling concepts.
Hands-on experience with:
ETL/ELT frameworks
Workflow orchestration tools (e.g., Airflow, Prefect, Dagster, or similar)
Distributed data processing frameworks (e.g., Apache Spark or similar; a minimal sketch follows this requirements list)
Experience working with cloud platforms, especially:
Microsoft Azure (e.g., Data Factory, Synapse, Fabric, Azure AI Foundry)
Google Cloud (e.g., BigQuery, Dataflow, Cloud Composer, or similar services)
Experience supporting LLM-based data infrastructure, including:
Vector databases
Embedding pipelines
Integration with frameworks such as LangChain (from a data engineering perspective)
Familiarity with BI tools and supporting analytical data models (Power BI, Tableau, Qlik).
Strong communication skills and ability to work in cross-functional teams.
A high level of English and Polish is a must.
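For the distributed-processing requirement above, a minimal PySpark sketch; the input path, columns, and aggregation are hypothetical:

```python
# Minimal PySpark aggregation sketch; paths and columns are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

# Read a partitioned dataset; Spark distributes the scan across executors.
orders = spark.read.parquet("s3://bucket/orders/")

# Aggregate in parallel, then write the result back as a mart table.
daily_revenue = (
    orders.groupBy("order_date")
          .agg(F.sum("amount").alias("revenue"),
               F.countDistinct("customer_id").alias("customers"))
)
daily_revenue.write.mode("overwrite").parquet("s3://bucket/marts/daily_revenue/")

spark.stop()
```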
Additional Information:
What do we offer you?
If you are passionate about data development & tech, we want to meet you!
Remote Work:
No
Employment Type:
Full-time
Talan is an international consulting and technology expertise group that accelerates the transformation of its clients by leveraging innovation, technology, and data. For over 20 years, Talan has been advising and supporting businesses and public institutions in the implementation of ...