Our client, RibbitZ, is looking for a Senior Software Engineer - LLM Evaluation to work remotely.
As a Software Engineering evaluator, you will create cutting-edge datasets for training, benchmarking, and advancing large language models, collaborating closely with researchers. This includes curating code examples, providing precise solutions, and making corrections in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go; evaluating and refining AI-generated code for efficiency, scalability, and reliability; and working with cross-functional teams to enhance enterprise-level AI-driven coding solutions.
What Does a Typical Day Look Like
- Work on AI model training initiatives by curating code examples, building solutions, and correcting code in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go.
- Evaluate and refine AI-generated code to ensure that it is efficient, scalable, and reliable.
- Collaborate with cross-functional teams to enhance AI-driven coding solutions against industry performance benchmarks.
- Build agents that can verify the quality of the code and identify error patterns.
- Hypothesize on steps in the software engineering cycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, operational maintenance) and evaluate model capabilities on them.
- Design verification mechanisms that can automatically verify a solution to a software engineering task.
Required Skills
- Several years of software engineering experience (5 years), including 2 years of continuous full-time experience at a top-tier product company (e.g., Google, Stripe, Amazon, Apple, Meta, Netflix, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research).
- Strong expertise in building full-stack applications and deploying scalable, production-grade software using modern languages and tools.
- Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
- Excellent oral and written communication skills for clear, structured evaluation rationales.
Eligibility (Strictly Enforced):
- Software Engineering profiles only
- Candidates must be based in the US
- 5 years of relevant experience
- Immediate assessment availability
Top companies:
- Google (Alphabet)
- Apple
- Amazon
- Meta (Facebook)
- Netflix
- Microsoft
- Tesla
- NVIDIA
- Adobe
- Salesforce
- GitHub
- Atlassian
- HashiCorp
- Databricks
- Snowflake
- Cloudflare, DigitalOcean, MongoDB
- Elastic, Confluent, Airbnb, Dropbox
- Stripe, Palantir, Uber, Lyft
- Square (Block), Twilio, Snap Inc.
- Pinterest, Figma, Oracle, Cisco
- PayPal, DoorDash, Rivian, Reddit, Coinbase, Splunk
- Spotify, Goldman Sachs, Morgan Stanley
- JPMorgan Chase, Capital One
- Plaid, Shopify, Intuit, Workday, ServiceNow
- Hugging Face, VMware, Brex, Wise
- Epic Games, Unity Technologies
- Activision Blizzard, Riot Games, Valve
- Huawei, Bloomberg, ByteDance
- Alibaba, Baidu, Notion, Klarna
- Instacart, Zillow