The Data Engineer will be responsible for designing, developing, and maintaining scalable and reliable data pipelines for a financial services project. The role focuses on backend data processing, data quality, and integration of multiple data sources in a cloud-based environment, working closely with international teams.
Key Responsibilities
- Design, develop, and maintain end-to-end ETL/ELT data pipelines to process large volumes of structured and semi-structured data.
- Implement backend data solutions using Python and SQL, applying Object-Oriented Programming (OOP) to ensure modularity, reusability, and maintainability (a brief OOP sketch follows this list).
- Orchestrate data workflows using Apache Airflow, including scheduling, monitoring, and failure handling (see the Airflow sketch after this list).
- Process and transform large datasets using PySpark in distributed environments.
- Integrate data from multiple sources, including APIs, relational databases, and cloud storage systems.
- Manage and utilize AWS S3 for data storage and data lake architectures.
- Apply data quality checks, validation rules, and deduplication logic to ensure data consistency and accuracy (see the PySpark sketch after this list).
- Develop, maintain, and support CI/CD pipelines using Bitbucket, ensuring controlled deployments, versioning, and code quality.
- Collaborate with cross-functional and international teams, contributing to technical discussions and documentation in English.
- Support downstream data consumers by ensuring datasets are well-structured, documented, and ready for analytics or reporting.
- Troubleshoot and resolve data pipeline issues, performance bottlenecks, and data inconsistencies.
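To give a concrete flavor of the OOP point above, here is a minimal sketch of a composable pipeline step; the PipelineStep base class, the Deduplicate step, and the record layout are hypothetical illustrations, not project code.

```python
from abc import ABC, abstractmethod


class PipelineStep(ABC):
    """Uniform contract so steps can be composed and reused across pipelines."""

    @abstractmethod
    def run(self, records: list[dict]) -> list[dict]:
        ...


class Deduplicate(PipelineStep):
    """Reusable step: keep the first record seen for each key."""

    def __init__(self, key: str) -> None:
        self.key = key

    def run(self, records: list[dict]) -> list[dict]:
        seen = set()
        out = []
        for record in records:
            if record[self.key] not in seen:
                seen.add(record[self.key])
                out.append(record)
        return out


# Steps compose into a pipeline without knowing about each other.
pipeline = [Deduplicate(key="id")]
rows = [{"id": 1}, {"id": 1}, {"id": 2}]
for step in pipeline:
    rows = step.run(rows)
print(rows)  # [{'id': 1}, {'id': 2}]
```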
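Similarly, a minimal, hypothetical Airflow DAG illustrating scheduling, retries, and a failure callback; the DAG id daily_ingest, the task names, and the notify_on_failure helper are illustrative assumptions, not the project's actual workflow.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def notify_on_failure(context):
    # Hypothetical alerting hook; a real pipeline might page or post to chat.
    print(f"task {context['task_instance'].task_id} failed")


def extract():
    print("extracting from source systems")


def transform():
    print("applying transformations")


with DAG(
    dag_id="daily_ingest",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # run once per day
    catchup=False,
    default_args={
        "retries": 2,  # failure handling: retry twice ...
        "retry_delay": timedelta(minutes=5),
        "on_failure_callback": notify_on_failure,  # ... then alert
    },
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task  # extract runs before transform
```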
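Finally, a minimal PySpark sketch of the kind of validation and deduplication described above, assuming Parquet input in a hypothetical S3 bucket; the paths and column names are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("quality-checks").getOrCreate()

# Read columnar input from a (hypothetical) S3 data lake location.
df = spark.read.parquet("s3a://example-bucket/raw/transactions/")

# Validation rules: require a business key and a non-negative amount.
valid = df.filter(F.col("transaction_id").isNotNull() & (F.col("amount") >= 0))

# Deduplication: keep a single row per business key.
deduped = valid.dropDuplicates(["transaction_id"])

# Basic consistency reporting before publishing downstream.
print(f"input rows: {df.count()}, valid+deduplicated rows: {deduped.count()}")

deduped.write.mode("overwrite").parquet("s3a://example-bucket/curated/transactions/")
```

In practice, checks like these would typically run inside the Airflow-orchestrated pipeline before data is published to downstream consumers.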
Qualifications:
- Programming Languages: Python, SQL
- Programming Paradigms: Object-Oriented Programming (OOP)
- Data Processing: PySpark
- Orchestration: Apache Airflow
- CI/CD: Bitbucket
- Cloud & Storage: AWS (S3)
- Data Sources: APIs, relational databases, Parquet files
- Data Architecture: ETL/ELT pipelines, data lakes
Required Skills & Experience
- Strong experience in data engineering and backend data development.
- Solid knowledge of Python and SQL with practical application of OOP principles.
- Experience building and maintaining production-grade ETL/ELT pipelines.
- Hands-on experience with Apache Airflow for workflow orchestration.
- Experience with CI/CD practices.
- Experience working with distributed data processing frameworks such as Spark/PySpark.
- Familiarity with cloud-based data platforms, preferably AWS.
- Ability to work autonomously while collaborating with remote international teams.
- Professional working proficiency in English.
Nice to Have
- Experience in financial services or regulated environments.
- Familiarity with data quality frameworks, monitoring, or observability tools.
- Exposure to Oracle APEX.
- Experience working in agile and/or DevOps-oriented teams.
Additional Information:
The candidate is expected to work in a hybrid model (50/50).
Remote Work:
No
Employment Type:
Full-time