
Data Engineer

Job Location

Lisbon - Portugal

Monthly Salary

Not Disclosed


Vacancy

1 Vacancy

Job Description

The Data Engineer is responsible for building and maintaining the infrastructure that supports the organization's data architecture. The role involves creating and managing data pipelines in Airflow for data extraction, processing, and loading, and ensuring their maintenance, monitoring, and stability.

 

The engineer will work closely with data analysts and end-users to provide accessible and reliable data.              

                                       

What we expect from the candidate

 

  • Candidate must be able to use Unix: must know how to use Unix commands to check processes, read files, and run bash commands. Candidate needs to know how to access a Unix server and run commands there, and, if some process is not running, check the server to see what might be going on. For example, if a Hadoop/YARN process is not running, or if some container for Airflow is not up, the candidate needs to know how to investigate.
  • Candidate must know how to list Docker containers, how to build Docker images, how to change existing images to add or remove things, and how to use and map volumes. Must know how to set up and maintain a distributed Airflow environment using Docker, including building custom Docker images using the Airflow image as the base.
  • We strongly expect that the candidate knows Airflow and its components, knows how to identify possible issues on the servers and fix them, and knows how to add more workers to the cluster. They need to make sure the containers are running fine on the servers and, if there is any issue, be able to fix it.
  • Candidate must know how to maintain a Hadoop/YARN cluster with Spark: which processes need to run on the servers, how to set up the XML configuration files for Hadoop and YARN, and how to perform commands in HDFS. They need to be able to add a new worker to the Hadoop cluster if necessary and fix any possible issues on the servers, and they need to know how to read the logs from YARN and HDFS. Must know and understand how Spark works using YARN as the resource manager.
  • Candidate must know how to develop in Python, how to manage packages with pip, how to review PRs from other people on the team, and how to maintain and use a Flask API (a minimal sketch follows this list).
  • Candidate must know SQL, including queries with CTEs and window functions, mainly on an Oracle database (an illustrative query also follows this list).
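
As a concrete illustration of the Flask point above, a minimal sketch of the kind of small Flask API the candidate would maintain could look like the following; the endpoint and the in-memory data are hypothetical and only stand in for a real service.

    # Minimal Flask API sketch (hypothetical endpoint, for illustration only).
    from flask import Flask, jsonify

    app = Flask(__name__)

    # In-memory stand-in for a real data source; a production API would
    # query a database or an HDFS/S3-backed store instead.
    _PIPELINE_STATUS = {"daily_load": "success", "hourly_sync": "running"}

    @app.route("/pipelines/<name>/status", methods=["GET"])
    def pipeline_status(name):
        """Return the last known status of a named pipeline."""
        status = _PIPELINE_STATUS.get(name)
        if status is None:
            return jsonify({"error": f"unknown pipeline: {name}"}), 404
        return jsonify({"pipeline": name, "status": status})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)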
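
The SQL expectation (CTEs and window functions on Oracle) can be pictured with the query below, run here through the python-oracledb driver; the table and column names are made up for illustration.

    # Hypothetical CTE + window-function query against Oracle, executed
    # with the python-oracledb driver. Table and column names are invented.
    import oracledb

    QUERY = """
    WITH daily_orders AS (
        SELECT customer_id,
               TRUNC(order_ts) AS order_day,
               SUM(amount)     AS daily_amount
          FROM orders
         GROUP BY customer_id, TRUNC(order_ts)
    )
    SELECT customer_id,
           order_day,
           daily_amount,
           SUM(daily_amount) OVER (
               PARTITION BY customer_id
               ORDER BY order_day
           ) AS running_total
      FROM daily_orders
    """

    def fetch_running_totals(dsn: str, user: str, password: str):
        """Run the query above and return all rows."""
        with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
            with conn.cursor() as cur:
                cur.execute(QUERY)
                return cur.fetchall()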

 

Main Tasks:

  • Responsible for maintaining the infrastructure that supports the current data architecture
  • Responsible for creating data pipelines in Airflow for data extraction, processing, and loading (a minimal DAG sketch follows this list)
  • Responsible for data pipeline maintenance, monitoring, and stability
  • Responsible for providing data access to data analysts and end-users
  • Responsible for DevOps infrastructure
  • Responsible for deploying Airflow DAGs to the production environment using DevOps tools
  • Responsible for code and query optimization
  • Responsible for code review
  • Responsible for improving the current data architecture and DevOps processes
  • Responsible for delivering data in useful and appealing ways to users
  • Responsible for performing and documenting analysis, review, and study on specified regulatory topics
  • Responsible for understanding business change and requirement needs, and assessing the impact and the cost
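
To picture the kind of pipeline the tasks above refer to, here is a minimal Airflow DAG sketch with extract, process, and load steps; the DAG id, schedule, and task logic are hypothetical placeholders.

    # Minimal Airflow DAG sketch (hypothetical DAG id and task logic),
    # illustrating an extract -> process -> load pipeline.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract(**context):
        # Pull raw records from a source system (placeholder data).
        return [{"id": 1, "value": 10}, {"id": 2, "value": 20}]

    def process(ti, **context):
        # Transform the extracted records (placeholder logic).
        rows = ti.xcom_pull(task_ids="extract")
        return [{**row, "value": row["value"] * 2} for row in rows]

    def load(ti, **context):
        # Load the processed records into the target store (placeholder).
        rows = ti.xcom_pull(task_ids="process")
        print(f"loading {len(rows)} rows")

    with DAG(
        dag_id="example_extract_process_load",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_process = PythonOperator(task_id="process", python_callable=process)
        t_load = PythonOperator(task_id="load", python_callable=load)

        t_extract >> t_process >> t_load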

Qualifications:

Technical Skills:

 

  • Python
  • Experience in creating APIs in Python
  • PySpark (a short sketch follows this list)
  • Spark environment architecture
  • SQL - Oracle database
  • Experience in creating and maintaining distributed environments using Hadoop and Spark
  • Hadoop ecosystem - HDFS, YARN
  • Containerization - Docker is mandatory
  • Data lakes - experience in organizing and maintaining data lakes - S3 is preferred
  • Experience with the Parquet file format
  • Apache Airflow - experience in both pipeline development and deploying Airflow in a distributed environment
  • Apache Kafka
  • Experience in automating application deployment using DevOps tools - Jenkins is mandatory, Ansible is a plus
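
As a sketch of the PySpark and data-lake skills listed above, the snippet below reads Parquet data from an object store, aggregates it, and writes the result back; the bucket, paths, and columns are hypothetical, and on the cluster the job would be submitted with YARN as the resource manager (e.g. spark-submit --master yarn).

    # Hypothetical PySpark job: read Parquet from a data lake, aggregate,
    # and write the result back. Paths and column names are invented.
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .appName("daily_totals_example")
        .getOrCreate()
    )

    orders = spark.read.parquet("s3a://example-bucket/raw/orders/")

    daily_totals = (
        orders
        .groupBy("customer_id", F.to_date("order_ts").alias("order_day"))
        .agg(F.sum("amount").alias("daily_amount"))
    )

    daily_totals.write.mode("overwrite").parquet(
        "s3a://example-bucket/curated/daily_totals/"
    )

    spark.stop()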

 

Language Skills

  • English                                                                                                                                      


Remote Work:

No


Employment Type:

Full-time
