Data Flow Engineer
Job Summary
- Design, develop, and maintain complex data flows within Cloudera DataFlow (Apache NiFi), ensuring scalable, reliable, and high-performance data movement across systems.
- Develop and optimize real-time and near-real-time data pipelines leveraging NiFi, Kafka, and CDC technologies (e.g. Debezium, SQL-based connectors).
- Implement integrations with internal and external systems using REST APIs, JDBC, Kafka, and other communication protocols, ensuring secure and resilient data exchange.
- Design and manage data schemas (Avro), metadata, and lineage using Apache Atlas, ensuring full traceability and governance of data flows.
- Define and enforce data security and access control policies using Apache Ranger, in alignment with enterprise governance frameworks.
- Monitor, troubleshoot, and optimize data pipelines for performance, reliability, and scalability, including proactive alerting and issue resolution.
- Collaborate with data engineers, architects, and business stakeholders to define requirements, design architectures, and deliver robust data flow solutions.
- Create and maintain technical documentation, SOPs, and runbooks for operational support and knowledge sharing.
- Support platform lifecycle activities, including upgrades, migrations, and enhancements across CDP, NiFi, and Kafka environments.
- Perform other related duties as assigned by the team leader.
Qualifications:
- Advanced university degree (Master's or equivalent) in computer science, information systems, data engineering, or a related field; a first-level degree combined with additional experience may be accepted in lieu of the advanced degree.
- At least one of the following certifications:
- Cloudera Certified Developer for Apache NiFi (or equivalent)
- Cloudera DataFlow (CFM) certification (or equivalent)
Equivalent certifications must be internationally recognized and accepted as valid credentials.
- Minimum 2-3 years of hands-on experience working with Apache NiFi, preferably within the Cloudera Data Platform (CDP) environment, including flow design, deployment, monitoring, and troubleshooting.
- Proven experience delivering at least one large-scale integration project using NiFi as a core technology (API integrations, database connectivity, transformation, routing, and delivery).
- Expert knowledge in designing, implementing, and maintaining complex data flows using Apache NiFi / Cloudera DataFlow.
- Advanced Python programming skills for data processing, automation, and custom flow development.
- Strong experience in building and integrating REST APIs, including authentication (OAuth/JWT), rate limiting, and error-handling strategies.
- Hands-on experience with CDC (Change Data Capture) approaches using NiFi processors/connectors and SQL-based methods.
- Practical experience with Apache Iceberg, including table design, schema evolution, partitioning, and integration with processing engines (e.g. Spark, Flink).
- Solid knowledge of data governance and catalog tools within CDP, including Apache Atlas (metadata, lineage, tagging) and Apache Ranger (security policies, authorization).
- Experience working with Apache Kafka as a messaging platform, including topics, producers/consumers, schema management, and NiFi integration.
- Good understanding of data serialization using Apache Avro, including schema evolution and compatibility principles.
- Strong analytical and problem-solving skills with the ability to diagnose and resolve complex data pipeline issues.
- Excellent communication and collaboration skills with the ability to work effectively in cross-functional teams.
- Fluency in written and spoken English.
Remote Work:
No
Employment Type:
Full-time
About Company
Arηs is a fully independent group of companies specialized in managing complex IT projects and systems for large organisations, focusing on state-of-the-art software development, business intelligence and infrastructure services. We are composed of 17 entities across 9 countries that ...