Data Scientist, AI Data Foundations

NextDeavor


Job Location:

Irvine, CA - USA

Yearly Salary: USD 114,000 - 175,000
Posted on: 19 hours ago
Vacancies: 1 Vacancy

Job Summary

Data Scientist, AI Data Foundations

Full-time
Remote
Exclusive, confidential search; details shared with qualified applicants.

Become a Key Player as a Data Scientist, AI Data Foundations

You will design and build the curated data structures that AI and ML applications consume, enabling higher-quality model training and inference. You will partner with model builders and with product, risk, and growth stakeholders to surface actionable insights and ship production-ready vector, feature, and graph data assets. This is a remote role.

Here's How You'll Make an Impact on the Team

  • Build and maintain vector stores for RAG, including embedding pipelines, chunking strategies, indexing, and refresh patterns.
  • Own the feature store: design, build, and operate feature definitions, freshness SLAs, lineage, and point-in-time correctness for offline/online use.
  • Design and implement graph data structures to model relationships across applicants, applications, products, lenders, decisions, and outcomes.
  • Lead data discovery: profile lending, deposit, and behavioral datasets to identify trends, segments, anomalies, and model drivers; produce actionable hypotheses for stakeholders.
  • Engineer curated, AI-ready datasets with appropriate quality checks, documentation, and governance for downstream model builders and analysts.
  • Define and run evaluation frameworks for RAG retrieval quality, feature drift, embedding quality, and graph completeness; iterate on metrics.
  • Partner closely with ML engineers and applied scientists to ensure data assets accelerate model development and serving workflows.
  • Champion responsible data use by collaborating with governance, security, and compliance teams to ensure data classification, consent, and regulatory boundaries are respected.
  • Communicate findings via write-ups, notebooks, dashboards, and short presentations for technical and non-technical audiences.

Here's What You'll Need to Be Successful in This Role

  • 4-7 years of experience in data science, ML engineering, or applied data roles, with significant time building data assets consumed by models or applications.
  • Hands-on experience designing and operating vector stores for RAG or semantic search (embedding generation, chunking, indexing, retrieval, evaluation).
  • Experience building or operating a feature store (e.g., Databricks Feature Store, Feast, or custom), including offline training and online serving patterns and point-in-time correctness.
  • Experience modeling and building graph data structures and writing graph queries (Neo4j, TigerGraph, Cosmos DB Gremlin, or similar).
  • Strong proficiency in Python (pandas, NumPy, scikit-learn, PySpark) and SQL; comfortable using Databricks notebooks and jobs.
  • Practical experience with embedding models and LLM tooling (Hugging Face, OpenAI/Azure OpenAI APIs, LangChain, or similar) in production or near-production contexts.
  • Demonstrated data discovery skills: profiling messy datasets, surfacing patterns, validating findings statistically, and explaining results clearly.
  • Solid grounding in classical ML concepts (supervised vs. unsupervised learning, train/test discipline, leakage, evaluation metrics).
  • Strong written and verbal communication skills for technical and business audiences.

Here's What Else Might Help You Out

  • Experience in SaaS or FinTech, especially with lending, deposit, credit, fraud, or KYC/AML data.
  • Familiarity with Databricks-native AI/ML tooling: Databricks Vector Search, Databricks Feature Store, MLflow, Unity Catalog.
  • Experience with open-source vector DBs (pgvector, Pinecone, Weaviate, Chroma, FAISS) and strong opinions on trade-offs.
  • Experience with Microsoft Azure data and AI services (Azure OpenAI, Azure AI Search, ADLS Gen2).
  • Experience evaluating RAG systems end-to-end (faithfulness, answer quality, hallucination measurement).
  • Exposure to graph algorithms (community detection, link prediction, centrality) applied to business problems.
  • Bachelor's or Master's in CS, Statistics, Mathematics, Engineering, or a related quantitative field, or equivalent experience.

Pay Range

$114,000 - $175,000/year

Ready to Make Your Mark

This role may fill quickly. Submit your resume to be considered.

Apply with Pioneers here


Required Experience:

IC

About Company

Hire trusted candidates who belong, stay, and advance. NextDeavor is a recruiting agency helping companies make more strategic hiring decisions. Find your next great hire: using AI technology to make the recruiting process more human, AI speeds up, refines, and expands our initial search. This ...
