Own the scientific and methodological side of GenAI delivery: problem framing, feasibility assessment, experimental design, evaluation strategy, metric selection, ground-truth creation, and decisions on model and prompting approaches. You'll build and validate GenAI/agentic prototypes, define what "good" means, and ensure solutions are measurably effective and safe before and after launch. You will build the GenAI MVP solution in a production-intent way (model choice, RAG/agent behaviour, prompts, and evaluation). AI Engineering will lead the overall design and will partner with you to harden, optimise, integrate, and scale the MVP into an enterprise-grade service.
Key Responsibilities:
- Translate business needs into testable GenAI hypotheses, clear outputs, and measurable success criteria; define scope boundaries (what the system should not attempt), including risks.
- Run feasibility assessments to choose the right approach: prompting vs RAG vs fine-tuning vs classical ML.
- Select and develop models based on task requirements (reasoning vs extraction vs classification), working with AI Engineering to understand latency/cost and risk profile.
- Design prompting strategies: instruction design, few-shot sets, structured outputs, tool/agent prompts, and robustness patterns. Implement these as an MVP and iterate based on eval results.
- Establish a prompt iteration methodology driven by evals (not anecdotal testing): prompt versioning, ablations, and change control.
- Define the evaluation plan for GenAI systems and agentic workflows: design and implement evaluation, including LLM-as-a-judge, thresholds, and metric creation. Ensure evaluation includes fairness and bias considerations where applicable. Define acceptance thresholds and release (go/no-go) gates tied to these metrics.
- Own experimentation and model improvements: run structured experiments (across prompts, retrievers, chunking, and models).
- Develop methods for identifying model failures such as hallucination types, retrieval misses, instruction-following errors, and formatting failures.
- Provide recommendations for improvements grounded in evidence: what to change, expected lift, and tradeoffs.
- Deliver an engineering-ready handoff: prompt packages and versioning approach, RAG configuration, tool schemas (if agentic), evaluation harness, datasets/ground truth, metric definitions, and go/no-go gates.
Required Collaboration Model:
- Act as the GenAI DS lead in project delivery: align stakeholders on success metrics, evaluation readouts, and go/no-go decisions.
- Partner with AI Engineering on LLM implementation needs by providing clear specs (prompts/tool schemas), eval harnesses, and acceptance thresholds.
- Mentor DS/analysts on GenAI evaluation methods, labelling operations, and scientific rigor.
- Work with Product and Software Engineers to integrate AI capabilities into platforms and user-facing services.
- Work with DevOps/Platform Engineers on environment setup, monitoring, infrastructure, and reliability.
- Work with Data Engineering to design and access upstream data pipelines.
Qualifications:
- 7 years of overall AI/ML experience, including 2 years with Generative AI solutions.
- Strong background in applied ML / data science with demonstrated GenAI delivery experience.
- Deep expertise in evaluation design, metrics, and dataset curation for LLM systems.
- Proven experience in model selection and prompt engineering, including structured output and tool-use prompting.
- Strong proficiency in Python and major ML frameworks (PyTorch, TensorFlow, scikit-learn).
- Experience in LLM fine-tuning, prompt engineering, or AI solution integration with enterprise applications.
- Familiarity with RAG design choices (chunking, embeddings, retrieval strategies, reranking) and how to evaluate them.
- Comfortable working with the Azure GenAI ecosystem (Azure OpenAI / Azure AI Foundry) from a consumer/solution perspective.
- Proven ability to build end-to-end GenAI MVPs in Python (RAG/agents, evaluation harness) and prepare them for production handoff.
- Excellent communication and stakeholder management skills, with a strategic mindset.
Additional Information:
Why Blend360
- Impactful Technical Work: Be at the forefront of AI innovation, designing and implementing cutting-edge technical solutions for leading companies and making a tangible impact on their businesses.
- Growth Opportunities: Thrive in a company and innovative team committed to growth, providing a platform for your technical and professional development.
- Collaborative Culture: Work alongside a team of world-class experts in data science, AI, and technology, fostering an environment of learning, sharing, and mutual support on complex technical challenges.
- Bold Vision: Join a company that is brave, goes the extra mile to innovate, and delivers bold visions for the future of AI.
- If you are a visionary who is passionate about leveraging AI and GenAI to drive business transformation, and you are excited by the prospect of shaping the future of our clients, we encourage you to apply!
Remote Work:
No
Employment Type:
Full-time