DE&A Core Project Management Project Management

Zensar Technologies

Job Location:

Pune - India

Monthly Salary: Not Disclosed

Posted on: Yesterday

Vacancies: 1 Vacancy

Job Summary

Description

Role Overview

You will own the end-to-end design and delivery of our Master Data Management (MDM) framework, Data Quality & Governance (DQG) pipelines, and enterprise Data Catalog. Working closely with product, analytics, and platform engineering teams, you will transform fragmented data assets into trusted, AI-ready data products, enabling everything from self-serve analytics to real-time ML inference. This is a senior individual contributor role with a clear path to staff/principal and early exposure to AI-augmented data management tooling.

Key Responsibilities

1. Master Data Management (MDM)

Design and implement a scalable MDM architecture covering customer, product, and entity master domains.
Build and maintain golden record pipelines using entity resolution, probabilistic matching, and survivorship rules.
Leverage Neo4j graph models to represent complex entity relationships and hierarchies that RDBMS cannot capture.
Drive cross-functional data stewardship workflows from source profiling to master record certification.

2. Data Quality & Governance (DQG)

Establish and operationalise a DQ framework: define critical data elements (CDEs), quality dimensions, and SLA thresholds.
Build automated DQ checks (completeness, uniqueness, validity, timeliness) integrated into CI/CD pipelines.
Instrument data observability tooling (Monte Carlo, Soda Core, or equivalent) to detect and alert on anomalies in real time.
Develop and maintain data governance policies in alignment with GDPR, CCPA, and ISO 8000 standards.
Produce executive-facing data quality scorecards and lineage dashboards.

3. Data Catalog & Metadata Management

Own the enterprise data catalog (Collibra, Alation, or open-source equivalent), including taxonomy, glossary, and ownership models.
Implement active metadata strategies: auto-tagging, lineage capture, and semantic enrichment using LLMs.
Drive catalog adoption across engineering, analytics, and business teams through self-serve onboarding.
Integrate catalog metadata with downstream AI feature stores and ML pipelines to ensure feature provenance and reusability.

4. AI-Augmented Data Management

Apply ML and LLM techniques to automate DQ rule generation, anomaly detection, and metadata enrichment.
Build AI-powered entity resolution models (embeddings, graph algorithms) to replace rule-based matching.
Collaborate with data science teams to deliver clean, governed, AI-ready datasets for model training and inference.
Evaluate and pilot emerging AI data tools (e.g. DataHub AI, OpenMetadata, custom RAG pipelines over catalog metadata).
Contribute to the internal AI data platform roadmap, helping define the standards for how AI models consume governed data.

5. Platform Engineering & Delivery

Design reusable data pipeline patterns on cloud-native stacks (AWS/GCP/Azure) using Spark, dbt, Airflow, or equivalent.
Mentor junior engineers; conduct design reviews and enforce engineering best practices.
Partner with data product owners to define and deliver certified data products on a data mesh architecture.

Required Qualifications

8 years in data engineering, with demonstrable depth in at least two of: MDM, DQG, or Data Catalog implementation.
Hands-on experience with at least one enterprise MDM platform (Informatica MDM, Reltio, Semarchy, or custom-built).
Proficiency in SQL, Python, and Spark for large-scale data processing.
Strong understanding of data governance frameworks (DAMA-DMBOK, DCAM, or equivalent).
Experience with a Data Catalog platform at production scale (Collibra, Alation, Atlan, DataHub, or OpenMetadata).
Working knowledge of graph data concepts; Neo4j Cypher experience is a strong advantage.
Hands-on exposure to ML/AI tooling: model training, feature engineering, or LLM-based automation.
Experience operating in cloud environments (AWS Glue, GCP Dataplex, Azure Purview, or equivalent).
Strong communication skills, with the ability to translate data governance concepts for non-technical stakeholders.

Preferred Qualifications

Neo4j Certified Professional or equivalent graph database certification.
Experience with data mesh principles and building certified data products.
Familiarity with vector databases and embedding-based search (Pinecone, Weaviate, or pgvector).
Contributions to open-source data governance or catalog projects.
Background in financial services, healthcare, or other regulated industries with stringent data compliance requirements.
Experience with real-time streaming data quality (Kafka, Great Expectations / Soda).

Responsibilities

MDM, DQG, and Data Catalog specialist with strong knowledge of AI

Qualifications

MDM, DQG, and Data Catalog specialist with strong knowledge of AI

About Company

Zensar Technologies

At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better f ...
