AI Engineer – Prompt Evaluation

Redmond, WA - USA

Hourly Salary: $72 - $79

Posted on: 30+ days ago

Vacancies: 1 Vacancy

This job posting is outdated and the position may be filled.

Job Summary

Team Red Dog is partnering with a global productivity and collaboration leader to hire an AI Engineer, Prompt Evaluation, to support Copilot experiences across Word, Excel, and PowerPoint. This onsite role in Redmond offers hands-on work in LLM prompt evaluation, synthetic data generation, and automation at Office scale, directly influencing how millions of users interact with AI-powered productivity tools.

Top Required Skills (Must Haves):

Experience setting up synthetic tenant data and data ingestion, including test accounts, grounding data generation, and configuration-as-code for evaluation environments.
Hands-on experience maintaining, validating, and automating test datasets for an LLM evaluation system, with a focus on quality and repeatability.
Ability to integrate evaluation quality checks into build and deployment pipelines to ensure performance, efficiency, and scalability.
Strong coding or scripting skills (Python highly preferred; C# acceptable) used to support testing, automation, and evaluation workflows.

Opportunity Overview:
This role sits at the intersection of AI engineering, experimentation, and large-scale product delivery. You will help build net-new evaluation capabilities for Copilot within Office applications, contributing to a rapidly evolving area of LLM prompt evaluation. The work directly supports shipping Copilot features at massive scale, offering rare exposure to real-world AI systems used daily by millions of users.

How you will make an impact:
Support evaluation of Suggested User Actions across Word, Excel, PowerPoint, and other host applications
Set up synthetic tenant data, ingestion pipelines, and grounding datasets for evaluation scenarios
Create and maintain evaluation test sets and configurations as code
Run evaluations, inspect results, and iterate with partner engineering and product teams
Automate the creation and validation of test datasets for LLM evaluation systems
Integrate evaluation quality checks into build and deployment pipelines
Perform hands-on and hands-off validation using internal evaluation toolsets
Build reporting pipelines to surface evaluation results and automate portions of the evaluation process

The expertise you bring:
Bachelor's degree in computer science, computer engineering, or a related technical field
2-4 years of professional experience in software engineering, data science, or experimentation-focused roles
Strong foundation in computer science fundamentals, including data structures, algorithms, and software design
Experience with large systems software development and testing
Proven ability to troubleshoot, unit test, and validate both new and legacy systems
Programming experience with demonstrated problem diagnosis and resolution skills

What makes a candidate highly successful in this role:
Successful candidates bring prior experience with LLM prompt engineering and evaluation, synthetic data generation, and experimentation workflows. They are comfortable working across testing, automation, and coding tasks, can reason about evaluation quality at scale, and collaborate effectively with partner teams to iterate quickly based on results.

Why Work with Team Red Dog
At Team Red Dog, people are at the heart of everything we do. Our commitment to personalized service and our deep experience in matching talented professionals with meaningful roles at some of the world's most inspiring companies are what set us apart. We take the time to understand your unique skills, strengths, and passions because we believe your career should reflect who you are.

Whether you're looking to grow, pivot, or simply find a place where your work truly matters, we offer opportunities that empower you to make a positive impact. With excellent benefits, a supportive team, and a role where you can thrive while doing what you love, we're here to help you take the next step with confidence. Join us and discover what it means to be genuinely valued in your career.

Generous benefits package for qualified employees includes:
Health insurance (medical, dental, vision, and life)
Employer-matched 401(k) plan
Paid time off
Paid holidays
Profit sharing

Estimated Start Date: January 1, 2026
Location: Onsite, Redmond, WA
Job #: 2427
Job Type and Estimated Duration: W2/Contract, 40 hours per week through 6/30/2026

Rate: $72 - $79/hour

Team Red Dog is committed to providing equal opportunities to everyone regardless of race, ethnicity, gender, age, religion, sexual orientation, disability, or any other characteristic. If you need accommodation during the recruitment process, reach out to us and we will work to ensure an accessible experience. We strictly adhere to federal, state, and local laws to maintain a workplace free from discrimination and harassment.
We offer competitive compensation aligned with U.S. industry standards, and our final offer will reflect the candidate's location, job-specific skills, experience, and knowledge.

All applicants must be authorized to work in the U.S. without the need for sponsorship.
Team Red Dog is an E-Verify employer.
Employment is contingent upon the successful completion of a reference and background check.
Please, no solicitations from C2C or recruiting firms.
