Senior AIML Engineer — Customer Data Platform

Sumeru Solutions

Job Location:

Atlanta, GA - USA

Monthly Salary: Not Disclosed
Posted on: 5 days ago
Vacancies: 1

Job Summary

Job Overview

We are seeking an AI/ML Engineer to build the intelligent systems that power identity resolution and data accessibility within our Customer Data Platform (CDP) - the authoritative source of truth for customer data across the entire US adult population.

This role focuses on developing machine learning pipelines that deduplicate, link, and resolve customer identities across disparate data sources - the core capability that transforms raw data into trusted, unified customer profiles. You will also contribute to LLM-based solutions that enable natural language querying of CDP data, making the platform accessible to business users across the organization.

You will work on both classical ML techniques and modern LLM-based approaches to ensure that every customer identity in CDP is accurately resolved, every profile is trustworthy, and every user can access the data they need.

Job Responsibilities - Identity Resolution

  • Develop and deploy entity resolution models to match and deduplicate customer records across multiple systems - directly impacting the accuracy of CDP as the source of truth
  • Implement probabilistic matching techniques (e.g., Fellegi-Sunter) and ML models (gradient boosting, neural classifiers) for record linkage across the US adult population
  • Build candidate blocking pipelines using phonetic algorithms (Soundex, Double Metaphone), token similarity, and LSH to handle billions of potential match pairs efficiently
  • Apply fuzzy matching techniques (Levenshtein, Jaro-Winkler, Jaccard) for customer attributes such as name, address, phone, and identifiers
  • Develop clustering algorithms (DBSCAN, hierarchical clustering) to create unified golden customer profiles that serve as the authoritative representation of each individual
  • Build embedding-based similarity systems using Sentence-BERT or transformer-based models for semantic matching
  • Implement ANN/KNN retrieval systems (FAISS, Annoy) for large-scale entity matching across population-scale datasets

Job Responsibilities - AI/LLM

  • Use LLMs (e.g., GPT, Claude) for classification and disambiguation of entity matches, improving resolution accuracy where traditional methods fall short
  • Build and support RAG pipelines to enrich customer profiles with contextual data from unstructured sources
  • Perform prompt engineering and evaluation for structured data extraction from unstructured inputs that feed into CDP
  • Contribute to NLQ-to-SQL systems enabling business users to query CDP data using natural language - making the authoritative source of truth accessible to non-technical stakeholders
  • Support integration with vector databases (e.g., Pinecone, pgvector, Qdrant) for semantic search across customer data

Education and Work Experience

  • Bachelor's or Master's degree in Computer Science, Data Science, or a related field
  • 3 years of experience in ML/AI engineering
  • At least 1 year of experience in entity resolution, record linkage, or deduplication - ideally at scale

Technical Skills

  • Programming: Python (required)
  • Libraries: scikit-learn, Hugging Face Transformers, RapidFuzz, jellyfish
  • Experience with LLM APIs (OpenAI, Anthropic) and prompt pipelines
  • Strong SQL skills and experience with Spark or Dask for distributed processing
  • Familiarity with vector databases and embedding-based retrieval
  • Experience with ML lifecycle tools (MLflow or similar)
  • Understanding of data quality metrics and how identity resolution impacts downstream trust

Knowledge, Skills, and Abilities

  • Strong understanding of ML fundamentals and similarity matching techniques applied to customer identity
  • Ability to work with large, messy, real-world datasets spanning hundreds of millions of records
  • Understanding of precision/recall tradeoffs in identity resolution and their impact on data trust
  • Good problem-solving and analytical skills
  • Ability to collaborate with data engineering, platform, and business teams to deliver accurate customer profiles