DE&A Core Project Management


Job Location:

Pune - India

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

Description

Role Overview

You will own the end-to-end design and delivery of our Master Data Management (MDM) framework, Data Quality & Governance (DQG) pipelines, and enterprise Data Catalog. Working closely with product, analytics, and platform engineering teams, you will transform fragmented data assets into trusted, AI-ready data products, enabling everything from self-serve analytics to real-time ML inference. This is a senior individual contributor role with a clear path to staff/principal and early exposure to AI-augmented data management tooling.

Key Responsibilities

1. Master Data Management (MDM)

  • Design and implement a scalable MDM architecture covering customer, product, and entity master domains.
  • Build and maintain golden record pipelines using entity resolution, probabilistic matching, and survivorship rules.
  • Leverage Neo4j graph models to represent complex entity relationships and hierarchies that RDBMS cannot capture.
  • Drive cross-functional data stewardship workflows from source profiling to master record certification.
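
As a rough illustration of the survivorship rules mentioned above, the sketch below merges already-matched duplicate records into one golden record. The field names, the source ranking, and the rule itself (the most trusted source wins, ties go to the most recent update) are assumptions made for this example, not a description of the team's actual pipeline.

```python
from datetime import date

# Hypothetical source ranking: lower number = more trusted (an assumption).
SOURCE_PRIORITY = {"crm": 0, "billing": 1, "web_form": 2}

def build_golden_record(records):
    """Merge matched records field by field using a simple survivorship rule.

    For each field, keep the first non-empty value found when records are
    ordered by source trust, then by most recent `updated` date.
    """
    def rank(rec):
        # Most trusted source first; within a source, newest record first.
        return (SOURCE_PRIORITY.get(rec["source"], 99), -rec["updated"].toordinal())

    ordered = sorted(records, key=rank)
    fields = {k for rec in records for k in rec if k not in ("source", "updated")}
    golden = {}
    for field in sorted(fields):
        for rec in ordered:
            value = rec.get(field)
            if value not in (None, ""):
                golden[field] = value
                break
    return golden

# Two records assumed to have been matched to the same person upstream.
matched = [
    {"source": "web_form", "updated": date(2024, 5, 1),
     "name": "A. Rao", "email": "a.rao@example.com", "phone": ""},
    {"source": "crm", "updated": date(2023, 1, 10),
     "name": "Anita Rao", "email": "", "phone": "+91-20-5550100"},
]
golden = build_golden_record(matched)
print(golden)
```

Here the CRM record supplies the name and phone number, while the email survives from the web form because the CRM value is empty. Production MDM platforms typically let stewards configure such rules per attribute.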

2. Data Quality & Governance (DQG)

  • Establish and operationalise a DQ framework: define critical data elements (CDEs), quality dimensions, and SLA thresholds.
  • Build automated DQ checks (completeness, uniqueness, validity, timeliness) integrated into CI/CD pipelines.
  • Instrument data observability tooling (Monte Carlo, Soda Core, or equivalent) to detect and alert on anomalies in real time.
  • Develop and maintain data governance policies in alignment with GDPR, CCPA, and ISO 8000 standards.
  • Produce executive-facing data quality scorecards and lineage dashboards.
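
For candidates less familiar with the four quality dimensions named above, a minimal sketch of how they might be scored and compared against SLA thresholds follows. The column names (`id`, `email`, `loaded_at`), the loose email pattern, and the threshold values are all hypothetical; a real setup would more likely use a dedicated DQ tool than hand-written checks.

```python
import re
from datetime import datetime, timedelta

# Deliberately loose pattern, used only to illustrate a validity rule.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def run_dq_checks(rows, now, max_age=timedelta(days=1)):
    """Return a score between 0 and 1 for each of four quality dimensions."""
    total = len(rows)
    if total == 0:
        return {}
    emails = [r.get("email") for r in rows]
    ids = [r["id"] for r in rows]
    return {
        # Share of rows with a non-empty email.
        "completeness": sum(1 for e in emails if e) / total,
        # Share of distinct ids among all rows.
        "uniqueness": len(set(ids)) / total,
        # Share of rows whose email matches the pattern.
        "validity": sum(1 for e in emails if e and EMAIL_RE.match(e)) / total,
        # Share of rows loaded within the allowed age.
        "timeliness": sum(1 for r in rows if now - r["loaded_at"] <= max_age) / total,
    }

def failed_dimensions(scores, thresholds):
    """List the dimensions whose score falls below its minimum threshold."""
    return sorted(d for d, minimum in thresholds.items() if scores[d] < minimum)

now = datetime(2024, 6, 2, 12, 0)
rows = [
    {"id": 1, "email": "a@example.com", "loaded_at": datetime(2024, 6, 2, 9, 0)},
    {"id": 2, "email": "", "loaded_at": datetime(2024, 6, 2, 9, 0)},
    {"id": 2, "email": "not-an-email", "loaded_at": datetime(2024, 6, 2, 9, 0)},
    {"id": 3, "email": "c@example.com", "loaded_at": datetime(2024, 5, 30, 9, 0)},
]
scores = run_dq_checks(rows, now)
thresholds = {"completeness": 0.7, "uniqueness": 1.0, "validity": 0.9, "timeliness": 0.7}
failures = failed_dimensions(scores, thresholds)
print(scores, failures)
```

In a CI/CD pipeline, a step like this could exit with a non-zero status whenever `failures` is non-empty, which blocks the deployment until the data issue is resolved.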

3. Data Catalog & Metadata Management

  • Own the enterprise data catalog (Collibra, Alation, or open-source equivalent), including taxonomy, glossary, and ownership models.
  • Implement active metadata strategies: auto-tagging, lineage capture, and semantic enrichment using LLMs.
  • Drive catalog adoption across engineering, analytics, and business teams through self-serve onboarding.
  • Integrate catalog metadata with downstream AI feature stores and ML pipelines to ensure feature provenance and reusability.

4. AI-Augmented Data Management

  • Apply ML and LLM techniques to automate DQ rule generation, anomaly detection, and metadata enrichment.
  • Build AI-powered entity resolution models (embeddings, graph algorithms) to replace rule-based matching.
  • Collaborate with data science teams to deliver clean, governed, AI-ready datasets for model training and inference.
  • Evaluate and pilot emerging AI data tools (e.g. DataHub AI, OpenMetadata, custom RAG pipelines over catalog metadata).
  • Contribute to the internal AI data platform roadmap, helping define the standards for how AI models consume governed data.

5. Platform Engineering & Delivery

  • Design reusable data pipeline patterns on cloud-native stacks (AWS/GCP/Azure) using Spark, dbt, Airflow, or equivalent.
  • Mentor junior engineers; conduct design reviews and enforce engineering best practices.
  • Partner with data product owners to define and deliver certified data products on a data mesh architecture.

Required Qualifications

  • 8 years in data engineering, with demonstrable depth in at least two of: MDM, DQG, or Data Catalog implementation.
  • Hands-on experience with at least one enterprise MDM platform (Informatica MDM, Reltio, Semarchy, or custom-built).
  • Proficiency in SQL, Python, and Spark for large-scale data processing.
  • Strong understanding of data governance frameworks (DAMA-DMBOK, DCAM, or equivalent).
  • Experience with a Data Catalog platform at production scale (Collibra, Alation, Atlan, DataHub, or OpenMetadata).
  • Working knowledge of graph data concepts; Neo4j Cypher experience is a strong advantage.
  • Hands-on exposure to ML/AI tooling: model training, feature engineering, or LLM-based automation.
  • Experience operating in cloud environments (AWS Glue, GCP Dataplex, Azure Purview, or equivalent).
  • Strong communication skills, with the ability to translate data governance concepts for non-technical stakeholders.

Preferred Qualifications

  • Neo4j Certified Professional or equivalent graph database certification.
  • Experience with data mesh principles and building certified data products.
  • Familiarity with vector databases and embedding-based search (Pinecone, Weaviate, or pgvector).
  • Contributions to open-source data governance or catalog projects.
  • Background in financial services, healthcare, or other regulated industries with stringent data compliance requirements.
  • Experience with real-time streaming data quality (Kafka, Great Expectations / Soda).




Responsibilities

MDM, DQG, and Data Catalog specialist with strong knowledge of AI



Qualifications

MDM, DQG, and Data Catalog specialist with strong knowledge of AI




About Company


At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better f ...
