Senior Data Engineer

RepRisk AG

Job Location:

Berlin - Germany

Monthly Salary: Not Disclosed
Posted on: 6 hours ago
Vacancies: 1 Vacancy

Job Summary

About You 

Are you looking for an opportunity to build robust, scalable data infrastructure that powers meaningful, cutting-edge machine learning projects? Do you want to work at a company where your contributions have a real, measurable impact, and where you're recognized and rewarded for it?

If you're passionate about data architecture, pipelines, and enabling ethical tech development, then this is the perfect role for you. We value autonomy, giving you the space to bring innovative engineering solutions to life in an inclusive, feedback-oriented environment. Your work will directly support NLP and machine learning initiatives that drive corporate responsibility through technology.

Your Responsibilities 

As our new Senior Data Engineer, you will architect, build, and scale a modern data platform, leveraging Databricks and lakehouse architecture principles. You will lead the design and delivery of enterprise-grade data infrastructure as part of our global Technology division. You will also:

  • Architect and implement end-to-end lakehouse solutions on Databricks, leveraging Delta Lake, Unity Catalog, and the Medallion architecture (Bronze/Silver/Gold)

  • Design, build, and maintain scalable, reliable ELT pipelines using Databricks Workflows, Delta Live Tables, and Apache Spark

  • Develop and optimize high-throughput streaming and batch data pipelines using Spark Structured Streaming and Auto Loader

  • Drive data platform performance tuning, cost optimization, and cluster/compute governance across Databricks environments

  • Define and enforce data contracts, schemas, and governance standards through Unity Catalog and Delta Lake

  • Ensure data quality, observability, and lineage across the platform using tools such as Databricks Data Observability and Great Expectations

  • Collaborate cross-functionally with data scientists, analysts, and platform teams to deliver reliable, self-serve data products

  • Establish and champion internal data engineering best practices, standards, and reusable frameworks

  • Stay current with the Databricks ecosystem, lakehouse trends, and emerging data engineering patterns

  • Participate in code reviews to maintain high standards of quality, performance, and security

  • Engage actively in Agile/Scrum ceremonies, contributing architectural insights and technical direction to the team


Qualifications:

You Offer 

  • A Bachelor's degree in computer science or a related STEM field

  • 5 years of hands-on experience in Data Engineering or a similar role

  • Strong proficiency in Python and SQL 

  • Solid experience with batch processing (e.g. AWS Glue / dbt) and stream processing technologies (e.g. Kafka)

  • Proven experience with Dimensional Data Modelling and Data Vault methodologies 

  • Experience with Data Orchestration tools such as Airflow or Dagster   

  • Familiarity with data quality and validation frameworks (e.g. Great Expectations, Soda, or similar)

  • Experience integrating with metadata tools such as Collibra or OpenMetadata

  • Strong understanding of version control (Git) and CI/CD pipelines

  • Experience working with cloud platforms (AWS preferred) 

  • Practical experience with Data Lakehouse concepts and technologies such as Databricks and Snowflake

  • A proactive mindset with strong ownership, initiative, and drive to push things forward

  • Strong communication skills with professional proficiency in English

Additionally, the following are a plus:

  • Experience delivering workflow configurations in BPM-based software such as Camunda

  • Experience working with Machine Learning teams and familiarity with ML/DL/NLP concepts


Additional Information:

Please note that we will only consider candidates with a valid work permit.


Remote Work:

No


Employment Type:

Full-time

Key Skills

  • Apache Hive
  • S3
  • Hadoop
  • Redshift
  • Spark
  • AWS
  • Apache Pig
  • NoSQL
  • Big Data
  • Data Warehouse
  • Kafka
  • Scala

About Company

RepRisk is a rapidly growing global company and a pioneer in the ESG data science field. Our goal is to make the world a better place by creating transparency in the business world – we are driving positive change via the power of data. We combine AI and machine learning with ...
