This role will be based in Mountain View, CA.
At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.
HALO (Human Judgment, Annotation, Localization, and Operations) is a horizontal team within Core AI that partners across the company to enable high-quality human judgment for AI development. We partner closely with cross-functional stakeholders and internal teams to define quality goals, design evaluation and data pipelines, and scale repeatable measurement systems. Our work spans multiple initiatives at once, supported by shared standards, platforms, and best practices that help teams move faster without compromising quality.
Key Responsibilities
Lead cross-functional alignment with Engineering, Product, Data Science, domain SMEs, Trust/Legal, TPM, and vendor operations on evaluation strategy, quality goals, tradeoffs, and delivery across multiple initiatives
Define and evolve evaluation frameworks for complex model and agent behaviors, including rubrics, rating scales, defect taxonomies, escalation criteria, and market-specific guidance for ambiguous, multi-step, and high-impact use cases
Own end-to-end evaluation systems, including metrics, scorecards, regression sets, monitoring plans, scenario suites, and success criteria, and ensure outputs are repeatable, decision-useful, and adopted by partner teams
Design and operationalize annotation and evaluation pipelines across internal and vendor platforms, including task design, QA gates, adjudication approaches, workflow maintenance, and documentation
Drive development of human, synthetic, and adversarial datasets to improve evaluation coverage, identify blind spots, and support model iteration, LLM-as-a-judge systems, and reward model development
Lead calibration strategy and disagreement analysis across human and model judgments; identify drift, root causes, and reliability issues, and translate findings into guideline updates, new edge cases, retraining opportunities, and product quality improvements
Set and uphold quality standards for vendor and internal workforces, including onboarding, guideline training, audit design, escalation handling, and cost-quality tradeoff decisions across medium-to-large programs
Lead error analysis and evaluation experiments; synthesize findings into clear recommendations and influence roadmap, launch readiness, and quality investments
Define requirements for human judgment and evaluation tooling, and partner with Engineering on design, testing, rollout, and adoption
Create reusable standards and best practices that scale across teams, and enable partners on methodology, score interpretation, and appropriate use of evaluation outputs
Mentor junior team members on evaluation design, annotation quality, analysis methods, and operational excellence
Demonstrate learning agility in a rapidly evolving field by incorporating new tools, methods, and research into evaluation strategy and workflows
Apply native-speaker linguistic and cultural expertise in French (France), German (Germany), Spanish (Spain), Portuguese (Brazil), or other i18n market(s) to define market-appropriate quality standards and improve consistency across locales
Qualifications :
Basic Qualifications
BA/BS in Computational Linguistics, Linguistics, Language Technologies, or a related field
4 years of industry experience owning end-to-end human judgment operations and quality workflows for AI development
Proven experience leading medium-to-large evaluation or annotation programs in production environments
Experience working cross-functionally with partners such as Engineering, Product, and Data Science to drive decisions and execution
Experience developing evaluation frameworks for complex model or agent behaviors
Experience building or improving scalable evaluation or annotation workflows
Experience working with datasets and evaluation methods for LLMs or agentic systems
Experience analyzing quality signals and using findings to improve guidelines, workflows, or model performance
Experience with Python or an equivalent language for analysis, experimentation, metrics, or quality validation
Ability to communicate clearly in writing and verbally, including documenting decisions and aligning across functions
Preferred Qualifications
5-7 years of overall industry experience
MS or PhD in Computational Linguistics, Linguistics, Language Technologies, or a related field
Experience in more ambiguous, high-impact, or fast-evolving AI product areas
Experience with LLM-as-a-judge, reward modeling, or model-based evaluation approaches
Experience creating standards or frameworks used across multiple teams
Experience influencing product or quality direction through evaluation insights
Experience mentoring others in evaluation, annotation, or quality methods
Experience supporting i18n evaluation or linguistic quality across markets
Suggested Skills
Strategic Evaluation Design for Complex AI Systems
Quality Governance and Standards at Organizational Scale
Human Judgment Vendor and Workforce Strategy
You Will Benefit from Our Culture
We strongly believe in the well-being of our employees and their families. That is why we offer generous health and wellness programs and time away for employees of all levels. LinkedIn is committed to fair and equitable compensation practices. The pay range for this role is $133,000 - $216,000. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location. This may be different in other locations due to differences in the cost of labor. The total compensation package for this position may also include annual performance bonus, stock, benefits, and/or other applicable incentive compensation plans. For more information visit Information :
Equal Opportunity Statement
We seek candidates with a wide range of perspectives and backgrounds, and we are proud to be an equal opportunity employer. LinkedIn considers qualified applicants without regard to race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other legally protected class.
LinkedIn is committed to offering an inclusive and accessible experience for all job seekers, including individuals with disabilities. Our goal is to foster an inclusive and accessible workplace where everyone has the opportunity to be successful.
If you need a reasonable accommodation to search for a job opening, apply for a position, or participate in the interview process, connect with us at and describe the specific accommodation requested for a disability-related limitation.
Reasonable accommodations are modifications or adjustments to the application or hiring process that would enable you to fully participate in that process. Examples of reasonable accommodations include, but are not limited to:
A request for an accommodation will be responded to within three business days. However, non-disability related requests, such as following up on an application, will not receive a response.
LinkedIn will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by LinkedIn, or (c) consistent with LinkedIn's legal duty to furnish information.
San Francisco Fair Chance Ordinance
Pursuant to the San Francisco Fair Chance Ordinance, LinkedIn will consider for employment qualified applicants with arrest and conviction records.
Pay Transparency Policy Statement
As a federal contractor, LinkedIn follows the Pay Transparency and non-discrimination provisions described at this link:
Data Privacy Notice for Job Candidates
Please follow this link to access the document that provides transparency around the way in which LinkedIn handles personal data of employees and job applicants:
Work :
No
Employment Type :
Full-time
LinkedIn is the world’s largest professional network, built to create economic opportunity for every member of the global workforce. Our products help people make powerful connections, discover exciting opportunities, build necessary skills, and gain valuable insights every day.