Get to know us better
CodiLime is a software and network engineering industry expert and the first-choice service partner for top global networking hardware providers, software providers, and telecoms. We create proofs of concept, help our clients build new products, nurture existing ones, and provide services in production environments. Our clients include both tech startups and big players in various industries and geographic locations (US, Japan, Israel, Europe).
While we are no longer a startup (we have 250 people on board and have been operating since 2011), we've kept our people-oriented culture. Our values are simple:
- Act to deliver.
- Disrupt to grow.
- Team up to win.
The project and the team
The goal of this project is to build a centralized, large-scale business data platform for one of the biggest global consulting firms. The final dataset must be enterprise-grade, providing consultants with reliable, easily accessible information to help them quickly and effectively analyze company profiles during Mergers & Acquisitions (M&A) projects.
You will contribute to building data pipelines that ingest, clean, transform, and integrate large datasets from over 10 different data sources, resulting in a unified database with more than 300 million company records. The data must be accurate, well-structured, and optimized for low-latency querying. The platform will power several internal applications, enabling a robust search experience across massive datasets and making your work directly impactful across the organization.
The data will provide firm-level and site-level information, including firmographics, technographics, and hierarchical relationships (e.g. GU, DU, subsidiary, site). This platform will serve as a key data backbone for consultants, delivering critical metrics such as revenue, CAGR, EBITDA, number of employees, acquisitions, divestitures, competitors, industry classification, web traffic, related brands, and more.
Technology stack:
- Languages: Python, SQL
- Data Stack: Snowflake, DBT, PostgreSQL, Elasticsearch
- Processing: Apache Spark on Azure Databricks
- Workflow Orchestration: Apache Airflow
- Cloud Platform: Microsoft Azure
- Compute / Orchestration: Azure Databricks (Spark clusters), Azure Kubernetes Service (AKS), Azure Functions, Azure API Management
- Database & Storage: Azure Database for PostgreSQL, Azure Cosmos DB, Azure Blob Storage
- Security & Configuration: Azure Key Vault, Azure App Configuration, Azure Container Registry (ACR)
- Search & Indexing: Azure AI Search
- CI/CD: GitHub Actions
- Static Code Analysis: SonarQube
- AI Integration (Future Phase): Azure OpenAI
What else you should know:
Team Structure:
- Data Architecture Lead
- Data Engineers
- Backend Engineers
- DataOps Engineers
- Product Owner
Work culture:
- Agile, collaborative, and experienced work environment.
- As this project will significantly impact the organization, we expect a mature, proactive, and results-driven approach.
- You will work with a distributed team across Europe and India.
We work on multiple interesting projects at a time, so we may invite you to interview for another project if we see that your competencies and profile are well suited for it.
Your role
As part of the project team, you will be responsible for:
Data Pipeline Development:
- Designing, building, and maintaining scalable end-to-end data pipelines for ingesting, cleaning, transforming, and integrating large structured and semi-structured datasets (a minimal sketch of such a step follows this list)
- Optimizing data collection processing and storage workflows
- Conducting periodic data refresh processes (through data pipelines)
- Building a robust ETL infrastructure using SQL technologies.
- Assisting with data migration to the new platform
- Automating manual workflows and optimizing data delivery
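To give a flavor of this pipeline work, here is a minimal PySpark sketch of an ingest-and-clean step of the kind run on Azure Databricks; the storage path, column names, and target table are illustrative assumptions, not project specifics.

```python
# Minimal ingest-and-clean sketch for Azure Databricks (PySpark).
# The Blob Storage path, column names, and target table are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("company_ingest").getOrCreate()

# Ingest a raw, semi-structured vendor feed from Azure Blob Storage (path is illustrative).
raw = spark.read.json("abfss://raw@examplestorage.dfs.core.windows.net/vendor_a/companies/")

# Basic cleaning: normalize names and country codes, drop records without an identifier.
clean = (
    raw
    .withColumn("company_name", F.trim(F.lower(F.col("company_name"))))
    .withColumn("country_code", F.upper(F.col("country_code")))
    .filter(F.col("company_id").isNotNull())
    .dropDuplicates(["company_id"])
)

# Persist as a Delta table for downstream transformation and modeling.
clean.write.format("delta").mode("overwrite").saveAsTable("staging.vendor_a_companies")
```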
Data Transformation & Modeling:
- Developing data transformation logic using SQL and DBT for Snowflake.
- Designing and implementing scalable and high-performance data models.
- Creating matching logic to deduplicate and connect entities across multiple sources (a rough sketch follows this list).
- Ensuring data quality, consistency, and performance to support downstream applications.
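As a rough illustration of the matching and deduplication work, the following PySpark sketch keeps one record per normalized name and country across sources; the column names and the completeness-based ranking rule are assumptions for the example, not the project's actual matching rules.

```python
# Rough cross-source deduplication sketch: keep one record per (normalized name, country),
# preferring the row with the highest completeness score. Columns are illustrative placeholders.
from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.appName("entity_matching").getOrCreate()

# Hypothetical staging table that unions records from all sources.
companies = spark.read.table("staging.all_sources_companies")

matched = (
    companies
    .withColumn(
        "match_key",
        F.concat_ws("|", F.lower(F.trim(F.col("company_name"))), F.upper(F.col("country_code"))),
    )
    .withColumn(
        "rn",
        F.row_number().over(
            Window.partitionBy("match_key").orderBy(F.col("completeness_score").desc())
        ),
    )
    .filter(F.col("rn") == 1)
    .drop("rn", "match_key")
)

matched.write.format("delta").mode("overwrite").saveAsTable("core.companies_deduplicated")
```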
Workflow Orchestration:
- Orchestrating data workflows using Apache Airflow running on Kubernetes (a minimal DAG sketch follows this list).
- Monitoring and troubleshooting data pipeline performance and operations.
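For orientation, here is a minimal Airflow DAG sketch of a periodic refresh workflow; the DAG id, schedule, task names, and placeholder callables are assumptions for illustration only.

```python
# Minimal Airflow DAG sketch for a periodic data-refresh workflow.
# DAG id, schedule, and task callables are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_sources():
    """Placeholder: pull raw feeds from the upstream data sources."""


def run_transformations():
    """Placeholder: trigger the Spark/DBT transformation jobs."""


def refresh_search_index():
    """Placeholder: push the refreshed dataset to the search index."""


with DAG(
    dag_id="company_data_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="@weekly",  # `schedule` assumes Airflow 2.4+; earlier versions use `schedule_interval`
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_sources", python_callable=ingest_sources)
    transform = PythonOperator(task_id="run_transformations", python_callable=run_transformations)
    index = PythonOperator(task_id="refresh_search_index", python_callable=refresh_search_index)

    ingest >> transform >> index
```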
Data Platform & Integration:
- Enabling integration of 3rd-party and pre-cleaned data into a unified schema with rich metadata and hierarchical relationships.
- Working with relational (Snowflake, PostgreSQL) and non-relational (Elasticsearch) databases
Software Engineering & DevOps:
- Writing data processing logic in Python.
- Applying software engineering best practices: version control (Git), CI/CD pipelines (GitHub Actions), DevOps workflows.
- Ensuring code quality using tools like SonarQube.
- Documenting data processes and workflows.
- Participating in code reviews
Future-Readiness & Integration:
- Preparing the platform for future integrations (e.g. REST APIs, LLM/agentic AI).
- Leveraging Azure-native tools for secure and scalable data operations
- Being proactive and motivated to deliver high-quality work
- Communicating and collaborating effectively with other developers
- Maintaining project documentation in Confluence.
Do we have a match?
As a Data Engineer, you must meet the following criteria:
- Strong experience with Snowflake and DBT (must-have)
- Experience with data processing frameworks such as Apache Spark (ideally on Azure Databricks)
- Experience with orchestration tools like Apache Airflow, Azure Data Factory (ADF), or similar
- Experience with Docker, Kubernetes, and CI/CD practices for data workflows
- Strong SQL skills, including experience with query optimization
- Experience in working with large-scale datasets
- Very good understanding of data pipeline design concepts and approaches
- Experience with data lake architectures for large-scale data processing and analytics
- Very good coding skills in Python
- Writing clean, scalable, and testable code (unit tests)
- Understanding and applying object-oriented programming (OOP)
- Experience with version control systems: Git
- Good knowledge of English (minimum C1 level)
Beyond the criteria above, we would appreciate the following nice-to-haves:
- Experience with PostgreSQL (ideally Azure Database for PostgreSQL)
- Experience with GitHub Actions for CI/CD workflows
- Experience with API Gateway, FastAPI (REST, async)
- Experience with Azure AI Search or AWS OpenSearch
- Familiarity with developing ETL/ELT processes (a plus)
- Optional but valuable: familiarity with LLMs, Azure OpenAI, or agentic AI systems
More reasons to join us
- Flexible working hours and approach to work: fully remote, in the office, or hybrid
- Professional growth supported by internal training sessions and a training budget
- Solid onboarding with a hands-on approach to give you an easy start
- A great atmosphere among professionals who are passionate about their work
- The ability to change the project you work on