Mercor is partnering with a top AI research organization to evaluate and improve how coding assistants reason, act, and communicate during development workflows. We're seeking technically sharp experts (especially those with experience in code review, testing, or documentation) to assess full transcripts of user-AI coding conversations. This short-term, fully remote engagement helps shape the future of developer-assisting AI systems.
Key Responsibilities
Review long-form transcripts between users and AI coding assistants
Analyze the AI's logic, execution, and stated actions in detail
Score each transcript using a 10-point rubric across multiple criteria
Optionally write brief justifications citing examples from the dialogue
Detect mismatches between claims and actions (e.g., saying "I'll run tests" but not doing so)
Ideal Qualifications
Top choices:
Senior or Staff Engineers with deep code review experience and execution insight
QA Engineers with strong verification and consistency-checking habits
Technical Writers or Documentation Specialists skilled at comparing instructions vs. implementation
Also a strong fit:
Backend or Full-Stack Developers comfortable with function calls, APIs, and test workflows
DevOps or SRE professionals familiar with tool orchestration and system behavior analysis
Languages and Tools:
Proficiency in Python is helpful (most transcripts are Python-based)
Familiarity with other languages like JavaScript, TypeScript, Java, C, Go, Ruby, Rust, or Bash is a plus
Comfort with Git workflows, testing frameworks, and debugging tools is valuable
More About the Opportunity
Remote and asynchronous: complete tasks on your own schedule
Each transcript batch must be completed within 5 hours of starting (there is no cap on the number of tasks available)
Flexible task-based engagement with potential for recurring batches
Compensation & Contract Terms
Competitive hourly rates based on geography and experience
Contractors will be classified as independent service providers
Payments issued weekly via Stripe Connect
Application Process
Submit your resume to begin
If selected, you'll receive rubric documentation and access to the evaluation platform
Most applicants hear back within a few business days
About Mercor
Mercor is a talent marketplace that connects top experts with leading AI labs and research organizations
Our investors include Benchmark, General Catalyst, Adam D'Angelo, Larry Summers, and Jack Dorsey
Thousands of professionals across law, engineering, and research contribute to frontier AI projects via Mercor