DE&A Core Project Management
Job Summary
Role Overview
You will own the end-to-end design and delivery of our Master Data Management (MDM) framework, Data Quality & Governance (DQG) pipelines, and enterprise Data Catalog. Working closely with product, analytics, and platform engineering teams, you will transform fragmented data assets into trusted, AI-ready data products, enabling everything from self-serve analytics to real-time ML inference. This is a senior individual contributor role with a clear path to staff/principal and early exposure to AI-augmented data management tooling.
Key Responsibilities
1. Master Data Management (MDM)
- Design and implement a scalable MDM architecture covering customer, product, and entity master domains.
- Build and maintain golden record pipelines using entity resolution, probabilistic matching, and survivorship rules.
- Leverage Neo4j graph models to represent complex entity relationships and hierarchies that RDBMS cannot capture.
- Drive cross-functional data stewardship workflows from source profiling to master record certification.
2. Data Quality & Governance (DQG)
- Establish and operationalise a DQ framework: define critical data elements (CDEs), quality dimensions, and SLA thresholds.
- Build automated DQ checks (completeness, uniqueness, validity, timeliness) integrated into CI/CD pipelines.
- Instrument data observability tooling (Monte Carlo, Soda Core, or equivalent) to detect and alert on anomalies in real time.
- Develop and maintain data governance policies in alignment with GDPR, CCPA, and ISO 8000 standards.
- Produce executive-facing data quality scorecards and lineage dashboards.
3. Data Catalog & Metadata Management
- Own the enterprise data catalog (Collibra, Alation, or open-source equivalent), including taxonomy, glossary, and ownership models.
- Implement active metadata strategies: auto-tagging, lineage capture, and semantic enrichment using LLMs.
- Drive catalog adoption across engineering, analytics, and business teams through self-serve onboarding.
- Integrate catalog metadata with downstream AI feature stores and ML pipelines to ensure feature provenance and reusability.
4. AI-Augmented Data Management
- Apply ML and LLM techniques to automate DQ rule generation, anomaly detection, and metadata enrichment.
- Build AI-powered entity resolution models (embeddings, graph algorithms) to replace rule-based matching.
- Collaborate with data science teams to deliver clean, governed, AI-ready datasets for model training and inference.
- Evaluate and pilot emerging AI data tools (e.g. DataHub AI, OpenMetadata, custom RAG pipelines over catalog metadata).
- Contribute to the internal AI data platform roadmap, helping define the standards for how AI models consume governed data.
5. Platform Engineering & Delivery
- Design reusable data pipeline patterns on cloud-native stacks (AWS/GCP/Azure) using Spark, dbt, Airflow, or equivalent.
- Mentor junior engineers; conduct design reviews and enforce engineering best practices.
- Partner with data product owners to define and deliver certified data products on a data mesh architecture.
Required Qualifications
- 8 years in data engineering, with demonstrable depth in at least two of: MDM, DQG, or Data Catalog implementation.
- Hands-on experience with at least one enterprise MDM platform (Informatica MDM, Reltio, Semarchy, or custom-built).
- Proficiency in SQL, Python, and Spark for large-scale data processing.
- Strong understanding of data governance frameworks (DAMA-DMBOK, DCAM, or equivalent).
- Experience with a Data Catalog platform at production scale (Collibra, Alation, Atlan, DataHub, or OpenMetadata).
- Working knowledge of graph data concepts; Neo4j Cypher experience is a strong advantage.
- Hands-on exposure to ML/AI tooling: model training, feature engineering, or LLM-based automation.
- Experience operating in cloud environments (AWS Glue, GCP Dataplex, Azure Purview, or equivalent).
- Strong communication skills, with the ability to translate data governance concepts for non-technical stakeholders.
Preferred Qualifications
- Neo4j Certified Professional or equivalent graph database certification.
- Experience with data mesh principles and building certified data products.
- Familiarity with vector databases and embedding-based search (Pinecone, Weaviate, or pgvector).
- Contributions to open-source data governance or catalog projects.
- Background in financial services, healthcare, or other regulated industries with stringent data compliance requirements.
- Experience with real-time streaming data quality (Kafka, Great Expectations / Soda).
Responsibilities
MDM, DQG, and Data Catalog specialist with strong knowledge of AI
Qualifications
MDM, DQG, and Data Catalog specialist with strong knowledge of AI
About Company
At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better f ...