Manager Notes:
- This position will be offshore, based in India; Bangalore preferred if possible
- Reports to: Innovation team lead / DKB Platform Squad lead
- Duration: 6-month innovation fund engagement (potential conversion to permanent if the platform goes live)
- What success looks like:
- 4 connectors in production, all with integration tests and idempotent upserts
- Graph schema supporting 5 entity types and 5 relationship types
- Recursive CTE queries returning sub-second results for 3-hop traversals
- Dev/staging/prod environments operational with CI/CD
- Must-haves:
- Full-stack development, Databricks data platform engineering, proficient SQL, data ingestion pipelines, Python engineering, IaC, and CI/CD
What you'll do:
- Design and build the DKB entity schema on Delta Lake (entity tables governed by Unity Catalog; edge tables deferred to Phase 2 Graph Extension)
- Build ingestion connectors for Unity Catalog metadata, GitHub, Confluence, and dbt -- each producing a standard changeset with idempotent upserts (see the upsert sketch after this list)
- Implement graph traversal queries using Databricks SQL recursive CTEs on Serverless SQL Warehouse
- Enrich entity metadata from the Unity Catalog Lineage API (table-to-table, column-to-column, and runtime lineage as entity properties)
- Build the Graph Repository abstraction layer, a Python interface over Databricks SQL (see the repository sketch after this list)
- Set up dev/staging/prod environments via UC catalog isolation, Terraform (IaC), and Databricks Asset Bundles (deployment)
- Own data quality: Pydantic validation at the ingestion boundary, dead-letter queues, graph integrity tests, and connector integration tests with golden datasets (see the validation sketch after this list)
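
For candidates unfamiliar with the changeset pattern, here is a minimal sketch of the connector contract, assuming a hypothetical changeset shape (entity_id, entity_type, properties, source, extracted_at), an illustrative staging table, and the databricks-sql-connector client -- none of these names are the project's actual schema:

```python
# Hedged sketch: apply a connector changeset idempotently via Delta MERGE.
# Table and column names are assumptions for illustration only.
from databricks import sql

MERGE_STMT = """
MERGE INTO dkb.entities AS t
USING dkb.changeset_staging AS s
  ON t.entity_id = s.entity_id
WHEN MATCHED THEN UPDATE SET
  t.entity_type  = s.entity_type,
  t.properties   = s.properties,
  t.source       = s.source,
  t.extracted_at = s.extracted_at
WHEN NOT MATCHED THEN INSERT
  (entity_id, entity_type, properties, source, extracted_at)
  VALUES (s.entity_id, s.entity_type, s.properties, s.source, s.extracted_at)
"""

def apply_changeset(host: str, http_path: str, token: str) -> None:
    # Re-running the same staged changeset is a no-op because the MERGE is
    # keyed on entity_id -- that is what makes the upsert idempotent.
    with sql.connect(server_hostname=host, http_path=http_path,
                     access_token=token) as conn:
        with conn.cursor() as cursor:
            cursor.execute(MERGE_STMT)
```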
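A sketch of how the Graph Repository and the recursive-CTE traversal could fit together, assuming a hypothetical dkb.edges (src_id, dst_id) table and named query parameters; the WITH RECURSIVE form follows standard Databricks SQL syntax, everything else is illustrative:

```python
# Hedged sketch: Graph Repository facade hiding a recursive-CTE traversal.
# dkb.edges and its columns are assumptions, not the project's schema.
from databricks import sql

TRAVERSAL = """
WITH RECURSIVE reachable (entity_id, depth) AS (
  SELECT dst_id, 1
  FROM dkb.edges
  WHERE src_id = :start_id
  UNION ALL
  SELECT e.dst_id, r.depth + 1
  FROM reachable r
  JOIN dkb.edges e ON e.src_id = r.entity_id
  WHERE r.depth < :max_hops
)
SELECT DISTINCT entity_id, depth FROM reachable
"""

class GraphRepository:
    """Thin Python interface over Databricks SQL, per the role description."""

    def __init__(self, connection):
        self._conn = connection

    def reachable_from(self, start_id: str, max_hops: int = 3):
        # Returns (entity_id, depth) rows within max_hops of start_id.
        with self._conn.cursor() as cursor:
            cursor.execute(TRAVERSAL,
                           {"start_id": start_id, "max_hops": max_hops})
            return cursor.fetchall()

def make_repository(host: str, http_path: str, token: str) -> GraphRepository:
    # Connection via the databricks-sql-connector package.
    return GraphRepository(sql.connect(
        server_hostname=host, http_path=http_path, access_token=token))
```

The sub-second 3-hop traversal success criterion above would be measured against exactly this kind of query.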
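And a minimal sketch of validation at the ingestion boundary with a dead-letter path, assuming a hypothetical EntityRecord model; in production the dead-letter rows would land in a Delta table rather than an in-memory list:

```python
# Hedged sketch: Pydantic validation at the ingestion boundary. Rows that
# fail validation are diverted to a dead-letter collection along with the
# error detail, so one bad record never aborts a connector run.
from pydantic import BaseModel, ValidationError

class EntityRecord(BaseModel):  # hypothetical model, not the project's schema
    entity_id: str
    entity_type: str
    properties: dict

def validate_batch(raw_rows: list[dict]) -> tuple[list[EntityRecord], list[dict]]:
    valid: list[EntityRecord] = []
    dead_letter: list[dict] = []
    for row in raw_rows:
        try:
            valid.append(EntityRecord(**row))
        except ValidationError as exc:
            dead_letter.append({"row": row, "errors": exc.errors()})
    return valid, dead_letter
```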
Must-have skills:
- 4 years with Databricks (Delta Lake, Unity Catalog, SQL Warehouse, Jobs Compute)
- Strong SQL, including recursive CTEs, window functions, and complex joins
- Python for data engineering (Pydantic, pytest, API clients)
- Experience building data ingestion pipelines from REST APIs (GitHub, Confluence, Jira-style)
- Comfortable with IaC (Terraform) and CI/CD for Databricks (Databricks Asset Bundles)
- Understanding of graph data modeling (entity-relationship, property graphs) -- does not need to be a graph DB expert
Nice-to-have:
- Experience with Unity Catalog Lineage API or External Lineage
- Experience with dbt (lineage metadata)
- Familiarity with Mosaic AI Vector Search
- Prior work with knowledge graphs or metadata platforms