Role: AI Prompt Evaluator (Prompt Reviewer), Remote (India)
Role & Responsibilities
- Review and rate prompts and model outputs against detailed annotation guidelines to assess accuracy, relevance, safety, and alignment.
- Label content for toxicity, bias, hallucination, instruction-following, and factuality; flag edge cases and ambiguous prompts for escalation.
- Use annotation platforms to create high-quality training examples, maintain consistent labels, and meet productivity and accuracy targets.
- Provide constructive feedback to prompt engineers and data teams: document failure modes, suggest prompt modifications, and propose test cases.
- Participate in calibration sessions to ensure inter-annotator agreement; help refine guidelines and examples to improve consistency.
- Maintain accurate logs, adhere to data security and privacy protocols, and contribute to continuous improvement of QA processes.
Skills & Qualifications
Must-Have
- Prompt evaluation
- Large language models
- Labelbox
- Scale AI
- Prodigy
- Toxicity assessment
Preferred
- Experience with inter-annotator agreement processes
- Familiarity with bias and fairness assessment frameworks
- Advanced Google Sheets skills for tracking and reporting
Additional Qualifications
- Proven experience reviewing or annotating AI-generated content or training datasets, or performing QA for NLP systems.
- Strong attention to detail and ability to follow precise guidelines; comfort working in distributed remote teams in India.
- Comfortable providing clear written feedback and documenting examples of model failure modes.
Benefits & Culture Highlights
- Fully remote role with a flexible work culture built for knowledge workers across India.
- Opportunity to shape model safety and dataset quality at an early stage, with direct impact on product and research outcomes.
- Collaborative environment with prompt engineers, data scientists, and trust & safety experts; ongoing learning and upskilling.
To apply, highlight relevant annotation or AI-evaluation experience and include examples of past dataset work or evaluation projects. Career Bloc values rigorous quality standards and is committed to building safe, aligned AI systems.
Required Skills:
training, prompt, annotation, content, Google Sheets, assessment, data