Data Flow Engineer

ARHS

Job Location:

Warsaw - Poland

Monthly Salary: Not Disclosed
Posted on: 12 hours ago
Vacancies: 1 Vacancy

Job Summary

  • Design, develop, and maintain complex data flows within Cloudera DataFlow (Apache NiFi), ensuring scalable, reliable, and high-performance data movement across systems.
  • Develop and optimize real-time and near real-time data pipelines leveraging NiFi, Kafka, and CDC technologies (e.g. Debezium, SQL-based connectors).
  • Implement integrations with internal and external systems using REST APIs, JDBC, Kafka, and other communication protocols, ensuring secure and resilient data exchange.
  • Design and manage data schemas (Avro), metadata, and lineage using Apache Atlas, ensuring full traceability and governance of data flows.
  • Define and enforce data security and access control policies using Apache Ranger in alignment with enterprise governance frameworks.
  • Monitor, troubleshoot, and optimize data pipelines for performance, reliability, and scalability, including proactive alerting and issue resolution.
  • Collaborate with data engineers, architects, and business stakeholders to define requirements, design architectures, and deliver robust data flow solutions.
  • Create and maintain technical documentation, SOPs, and runbooks for operational support and knowledge sharing.
  • Support platform lifecycle activities, including upgrades, migrations, and enhancements across CDP, NiFi, and Kafka environments.
  • Perform other related duties as assigned by the team leader.

Qualifications:

  • Advanced university degree (Master's or equivalent) in computer science, information systems, data engineering, or a related field; a first-level degree combined with additional experience may be accepted in lieu of the advanced degree.
  • At least one of the following certifications:
  1. Cloudera Certified Developer for Apache NiFi (or equivalent)
  2. Cloudera DataFlow (CFM) certification (or equivalent)
    Equivalent certifications must be internationally recognized and accepted as valid credentials.
  • Minimum 23 years of hands-on experience working with Apache NiFi, preferably within the Cloudera Data Platform (CDP) environment, including flow design, deployment, monitoring, and troubleshooting.
  • Proven experience delivering at least one large-scale integration project using NiFi as a core technology (API integrations, database connectivity, transformation, routing, and delivery).
  • Expert knowledge in designing, implementing, and maintaining complex data flows using Apache NiFi / Cloudera DataFlow.
  • Advanced Python programming skills for data processing, automation, and custom flow development.
  • Strong experience in building and integrating REST APIs, including authentication (OAuth/JWT), rate limiting, and error handling strategies.
  • Hands-on experience with CDC (Change Data Capture) approaches using NiFi processors/connectors and SQL-based methods.
  • Practical experience with Apache Iceberg, including table design, schema evolution, partitioning, and integration with processing engines (e.g. Spark, Flink).
  • Solid knowledge of data governance and catalog tools within CDP, including Apache Atlas (metadata, lineage, tagging) and Apache Ranger (security policies, authorization).
  • Experience working with Apache Kafka as a messaging platform, including topics, producers/consumers, schema management, and NiFi integration.
  • Good understanding of data serialization using Apache Avro, including schema evolution and compatibility principles.
  • Strong analytical and problem-solving skills, with the ability to diagnose and resolve complex data pipeline issues.
  • Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
  • Fluency in written and spoken English.

Remote Work:

No


Employment Type:

Full-time

About Company

Arηs is a fully independent group of companies specialized in managing complex IT projects and systems for large organisations, focusing on state-of-the-art software development, business intelligence and infrastructure services. We are composed of 17 entities across 9 countries that ...