Freelance AI Evaluation Consultant

Toloka

Job Location: Amsterdam - Netherlands

Monthly Salary: Not Disclosed
Posted on: 6 hours ago
Vacancies: 1 Vacancy

Job Summary

Company Intro

At Toloka AI, we create data that powers leading GenAI models and innovations. We work with frontier labs, big tech, renowned AI startups, enterprises, and non-profit research organizations worldwide. We use a combination of Experts, Crowd, and Tech Platform to teach AI models to reason and to evaluate their efficacy and safety. We have experts in more than 50 different domains, from doctors and lawyers to physicists and engineers, and boast one of the most diverse global crowds, representing over 100 countries and speaking 40 languages. We are a well-funded startup with an enviable portfolio of clients including Anthropic, Amazon, Microsoft, poolside, Recraft, and Shopify.

Recently, we secured strategic investment led by Bezos Expeditions, with participation from Mikhail Parakhin, CTO of Shopify and board advisor to leading GenAI companies, who now serves as our Chairman of the Board. Our remote-first team is distributed around the world: USA, UK, the Netherlands, Israel, Czech Republic, Serbia, and more. We are headquartered in Amsterdam.

About the role:

We are seeking an analytical and technically minded professional to:

  • Evaluate AI outputs and processes
  • Ensure quality, accuracy, and reliability
  • Identify logical errors, risks, and structural inconsistencies
  • Provide actionable insights and recommendations to the team

Ideal candidates:

  • Consultants, auditors, analysts, data researchers, or business/technical analysts with strong reasoning skills
  • Professionals curious about AI, process improvement, and quality evaluation
  • Problem-solvers who enjoy analyzing complex systems, logic, and scenarios

Key Responsibilities:

  • Lead evaluation of AI outputs and related processes
  • Review tasks against expected/ideal scenarios; identify gaps and risks
  • Provide structured, actionable recommendations to engineers, domain experts, and managers
  • Maintain and improve evaluation guidelines, checklists, and SOPs
  • Suggest new approaches, tools, and processes to enhance AI evaluation

Experience & Background:

  • Scenario validation, data analysis, auditing, or consulting experience
  • Analytical work in research, technical/business analysis, or risk evaluation

Knowledge & Skills:

  • Strong analytical and critical thinking
  • Attention to detail, reliability, and an ownership mindset
  • Technical understanding: JSON/YAML, basic Git/GitHub
  • Clear English (B2) for communication and documentation
  • Independent, proactive mindset

Nice to Have:

  • Scenario-based testing, annotation workflows, AI/LLM evaluation
  • Experience in cross-functional teams


Important Notice: Scam Alert Regarding Fake Job Postings

It has come to our attention that an individual or group is fraudulently impersonating Toloka to post fake jobs and solicit personal information from be aware:

Key Skills

  • Cobol
  • Microsoft Publisher
  • Industrial Engineering
  • Xero
  • NIS
  • Constant Contact
  • Adobe Flash
  • Samba
  • Creative Writing
  • Kendo
  • Management Consulting
  • Project Coordination