Blend is hiring a Senior Data Scientist (Generative AI) to spearhead the development of advanced AI-powered classification and matching systems on Databricks. You will contribute to flagship programs like the Diageo AI POC by building RAG pipelines, deploying agentic AI workflows, and scaling LLM-based solutions for high-precision entity matching and MDM modernization.
Key Responsibilities
- Design and implement end-to-end AI pipelines for product classification, fuzzy matching, and deduplication using LLMs, RAG, and Databricks-native workflows.
- Develop scalable, reproducible AI solutions within Databricks notebooks and job clusters, leveraging Delta Lake, MLflow, and Unity Catalog.
- Engineer Retrieval-Augmented Generation (RAG) workflows using vector search and integrate them with Python-based matching logic.
- Build agent-based automation pipelines (rule-driven GenAI agents) for anomaly detection, compliance validation, and harmonization logic.
- Implement explainability, audit trails, and governance-first AI workflows aligned with enterprise-grade MDM needs.
- Collaborate with data engineers, BI teams, and product owners to integrate GenAI outputs into downstream systems.
- Contribute to modular system design and documentation for long-term scalability and maintainability.
Qualifications :
- Bachelor's/Master's in Computer Science, Artificial Intelligence, or a related field.
- 5 years of overall Data Science experience with 2 years in Generative AI / LLM-based applications.
- Deep experience with the Databricks ecosystem: Delta Lake, MLflow, DBFS, Databricks Jobs & Workflows.
- Strong Python and PySpark skills, with the ability to build scalable data pipelines and AI workflows in Databricks.
- Experience with LLMs (e.g., OpenAI, LLaMA, Mistral) and frameworks like LangChain or LlamaIndex.
- Working knowledge of vector databases (e.g., FAISS, Chroma) and prompt engineering for classification/retrieval.
- Exposure to MDM platforms (e.g., Stibo STEP) and familiarity with data harmonization challenges.
- Experience with explainability frameworks (e.g., SHAP, LIME) and AI audit tooling.
Preferred Skills
- Knowledge of agentic AI architectures and multi-agent orchestration.
- Familiarity with Azure Data Hub and enterprise data ingestion frameworks.
- Understanding of data governance, lineage, and regulatory compliance in AI systems.
Additional Information :
Thrive & Grow with Us:
- Competitive Salary: Your skills and contributions are highly valued here, and we make sure your salary reflects that, rewarding you fairly for the knowledge and experience you bring to the table.
- Dynamic Career Growth: Our vibrant environment offers you the opportunity to grow rapidly, providing the right tools, mentorship, and experiences to fast-track your career.
- Idea Tanks: Innovation lives here. Our Idea Tanks are your playground to pitch, experiment, and collaborate on ideas that can shape the future.
- Growth Chats: Dive into our casual Growth Chats, where you can learn from the best. Whether it's over lunch or during a laid-back session with peers, it's the perfect space to grow your skills.
- Snack Zone: Stay fuelled and inspired! In our Snack Zone, you'll find a variety of snacks to keep your energy high and ideas flowing.
- Recognition & Rewards: We believe great work deserves to be recognized. Expect regular Hive-Fives, shoutouts, and the chance to see your ideas come to life as part of our reward program.
- Fuel Your Growth Journey with Certifications: We're all about your growth groove! Level up your skills with our support as we cover the cost of your certifications.
Remote Work :
No
Employment Type :
Full-time