We are seeking a Data Scientist with 3-4 years of experience to play a critical role in enhancing the Large Language Model (LLM) development lifecycle. You will be responsible for designing and building sophisticated LLM-assisted Quality Assurance (QA) solutions.
The primary goal is to analyze model failures, identify data gaps, and create real-time tools that guide our human data generators to produce high-impact training data. This role is highly analytical and technical, sitting at the intersection of model evaluation, data analysis, and human-in-the-loop process improvement.
Key Responsibilities
Develop LLM-Assisted QA Solutions: Design, build, and deploy intelligent tools that assist human data generators in real time, verifying that new data aligns with identified model needs.
Analyze Model Failures: Conduct deep-dive analyses into model failure modes to identify and categorize new loss patterns and emerging weaknesses.
Run Studies: Systematically design and execute experiments to understand model behavior and pinpoint the root causes of errors.
Define Data Requirements: Translate your analysis of model failures into specific, actionable data requirements that our human data generation teams can target for model improvement.
Create Quality Rubrics: Develop, document, and maintain comprehensive quality control rubrics and evaluation metrics. These rubrics must be adaptable across a wide variety of use cases, domains, and industry sectors.
Verify Data Generation: Build processes to validate that human-generated data effectively targets both existing and newly identified loss patterns.
Collaborate Cross-Functionally: Work closely with ML Engineers, AI Researchers, and Data Operations teams to ensure your QA solutions and insights are seamlessly integrated into the model training and deployment pipeline.
Required Qualifications
Experience: 3-4 years of professional experience in Data Science, Machine Learning Engineering, or a related role with a focus on NLP.
Education: Bachelor's or Master's degree in Computer Science, Data Science, Statistics, Computational Linguistics, or a related quantitative field.
LLM/NLP Expertise: Strong hands-on experience with Large Language Models (LLMs), NLP techniques, and the modern transformer ecosystem (e.g., the transformers library, GPT-family models, BERT, T5).
Technical Skills: High proficiency in Python and standard data science/ML libraries (e.g., Pandas, NumPy, Scikit-learn, PyTorch/TensorFlow).
Analytical Mindset: Proven ability to perform deep, rigorous analysis on complex and often unstructured data (model outputs, failure logs) to derive actionable insights.
Strong Communication: Excellent ability to create clear, concise documentation (especially technical rubrics) and communicate complex findings to both technical and non-technical stakeholders.
Preferred Qualifications (Nice-to-Have)
Direct experience building human-in-the-loop (HITL) systems or data annotation/QA tools.
Experience in experimental design and A/B testing within an ML context.
Familiarity with data-centric AI principles and practices.
Background in MLOps (e.g., experiment tracking, model versioning, deployment).
Experience working in a fast-paced R&D or product-driven environment.