Lead Data Engineer


Job Location:

Johannesburg - South Africa

Monthly Salary: R 650 - 700
Posted on: 3 hours ago
Vacancies: 1 Vacancy

Job Summary

Position: Lead Data Engineer

Contract Type: Fixed term / Contract

Contract Duration: Start Date: 25 May 2026; End Date: December 2026

Work Model: Hybrid (2-3 days a week)

Work Location: Sandton, Johannesburg, South Africa (Hybrid / Office-based as required)

Role Overview

We are seeking a Lead / Senior Data Engineer to design, build, and operate modern Databricks and Lakehouse data platforms that support advanced analytics, AI, and Generative AI use cases.

This is a senior individual contributor role operating within product-aligned, cross-functional squads. The successful candidate will deliver high-quality, governed, scalable data assets consumed by analytics platforms, machine learning models, and Generative AI solutions, including LLM- and agent-based systems.

Key Responsibilities

1. Databricks & Data Platform Engineering

Design, build, and operate data solutions using Databricks, including:

  • Delta Lake
  • Databricks Jobs and Workflows
  • Unity Catalog
  • Notebooks and shared libraries

Develop scalable, reliable Lakehouse architectures supporting analytics and AI workloads.

2. Data Enablement & Consumption

Enable data consumption for:

  • Generative AI use cases (e.g. Retrieval-Augmented Generation, AI services, agent workflows)
  • Analytics and reporting platforms
  • Downstream operational and business systems

Support feature-style and curated data access patterns required by AI and GenAI workloads.

3. Generative AI Data Enablement

Build and maintain data pipelines that feed Generative AI applications, including:

  • Curated knowledge and reference datasets
  • Structured and semi-structured data sources
  • Metadata, lineage, and traceability for AI consumption

Enable common GenAI data patterns such as:

  • Retrieval-Augmented Generation (RAG)
  • Contextual and prompt data preparation
  • Model input, output, and feedback data flows

4. Engineering Standards & Best Practices

Develop production-grade data pipelines using:

  • Python
  • SQL
  • Apache Spark

Implement automated testing, CI/CD, and deployment practices for data workloads.

Ensure data solutions are:

  • Observable
  • Resilient
  • Performant
  • Cost-efficient

Continuously improve data quality, reliability, and operational stability.

5. Collaboration & Ways of Working

  • Act as a senior engineer within a cross-functional product squad.
  • Collaborate closely with:
      • Product Owners
      • AI / Machine Learning Engineers
      • Analytics teams
      • Platform and security teams
  • Provide engineering input into design discussions and delivery decisions.
  • Support peer reviews and contribute to shared engineering standards.
  • Provide mentorship and technical guidance, including involvement in AI Engineer development.

6. Risk, Governance & Run

  • Ensure all data solutions comply with enterprise security, risk, and governance standards.
  • Support the operational stability of data pipelines used by analytics and AI workloads.
  • Participate in incident resolution and root cause analysis.
  • Maintain appropriate technical documentation and runbooks.

Required Background & Experience:

  • 10-15 years of industry experience in data engineering or related fields.
  • 5 years operating as a Senior or Lead Data Engineer.

Mandatory Technical Skills (with minimum experience):

  • Databricks (hands-on): 2 years
  • Enterprise data lake / lakehouse architecture: 5 years
  • Python: 5 years
  • SQL: 5 years
  • Apache Spark: 5 years
  • Production-grade data platforms: 3 years
  • Enterprise or regulated environments: 5 years

Mandatory Skills Summary:

  • Databricks
  • Data lake and lakehouse architecture
  • Python
  • SQL
  • Apache Spark
  • Production-grade data platforms
  • Enterprise or regulated environments

Desirable / Beneficial Skills:

  • Experience enabling AI, ML, or Generative AI use cases from a data engineering perspective
  • Familiarity with:
      • RAG data patterns
      • Feature-style or AI-serving datasets
      • Vector-based or embedding-ready data workflows
  • Experience working in Agile, product-aligned squads
  • Exposure to cloud-native data platforms such as AWS or Azure

Desired Skills Summary:

  • AI, ML, or Generative AI
  • RAG data patterns
  • Feature-style or AI-serving datasets
  • Vector or embedding-ready data workflows
  • Cloud-native data platforms (AWS or Azure)