Job Title: Sr AI/ML Engineer
Location: Bellevue, WA / Atlanta, GA / Overland Park, KS
Duration: / Term: C2C
Experience Desired: 10 Years
Job Responsibilities - Identity Resolution
- Develop and deploy entity resolution models to match and deduplicate customer records across multiple systems - directly impacting the accuracy of the CDP as the source of truth
- Implement probabilistic matching techniques (e.g. Fellegi-Sunter) and ML models (gradient boosting, neural classifiers) for record linkage across the US adult population
- Build candidate blocking pipelines using phonetic algorithms (Soundex, Double Metaphone), token similarity, and LSH to handle billions of potential match pairs efficiently
- Apply fuzzy matching techniques (Levenshtein, Jaro-Winkler, Jaccard) for customer attributes such as name, address, phone, and identifiers
- Develop clustering algorithms (DBSCAN hierarchical clustering) to create unified golden customer profiles that serve as the authoritative representation of each individual
- Build embedding-based similarity systems using Sentence-BERT or transformer-based models for semantic matching
- Implement ANN/KNN retrieval systems (FAISS, Annoy) for large-scale entity matching across population-scale datasets
Job Responsibilities - AI/LLM
- Use LLMs (e.g. GPT, Claude) for classification and disambiguation of entity matches, improving resolution accuracy where traditional methods fall short
- Build and support RAG pipelines to enrich customer profiles with contextual data from unstructured sources
- Perform prompt engineering and evaluation for structured data extraction from unstructured inputs feeding into the CDP
- Contribute to NLQ-to-SQL systems enabling business users to query CDP data using natural language - making the authoritative source of truth accessible to non-technical stakeholders
- Support integration with vector databases (e.g. Pinecone, pgvector, Qdrant) for semantic search across customer data
Education and Work Experience
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field
- 3 years of experience in ML/AI engineering
- At least 1 year of experience in entity resolution, record linkage, or deduplication - ideally at scale
Technical Skills
- Programming: Python (required)
- Libraries: scikit-learn, Hugging Face Transformers, RapidFuzz, jellyfish
- Experience with LLM APIs (OpenAI, Anthropic) and prompt pipelines
- Strong SQL skills and experience with Spark or Dask for distributed processing
- Familiarity with vector databases and embedding-based retrieval
- Experience with ML lifecycle tools (MLflow or similar)
- Understanding of data quality metrics and how identity resolution impacts downstream trust
Knowledge, Skills, and Abilities
- Strong understanding of ML fundamentals and similarity matching techniques applied to customer identity
- Ability to work with large, messy, real-world datasets spanning hundreds of millions of records
- Understanding of precision/recall tradeoffs in identity resolution and their impact on data trust
- Good problem-solving and analytical skills
- Ability to collaborate with data engineering, platform, and business teams to deliver accurate customer profiles
Key Skills:
Machine Learning, Generative AI, NLP, Fraud Detection, Agentic AI, LangChain, LangGraph