Software Engineer II, Risk Engineering

New York City, NY - USA

Monthly Salary: Not Disclosed

Posted on: Yesterday

Vacancies: 1 Vacancy

Job Summary

Datadog is hiring a Software Engineer II to strengthen our Risk Engineering team. In this role, you will help design and build AI-powered systems that transform how we manage risk at scale, delivering practical solutions that thoughtfully balance compliance, security, and business objectives. Reporting to the Engineering Manager, you will play a key role in scaling Datadog's risk management capabilities, driving high-impact engineering outcomes, and evolving our approach to meet emerging technologies and AI-driven workflows.

This role sits at the intersection of software engineering, LLM systems, and risk automation. You'll work primarily in Go while building and operationalizing AI-driven tooling centered around large language models (LLMs), prompt engineering, evaluation frameworks, structured data pipelines, and intelligent risk workflows. This is not a traditional backend role: we're looking for an engineer excited about rapid prototyping, experimenting with LLM capabilities, and turning those experiments into production-grade systems that improve risk visibility and decision-making.

What You'll Do:

Develop Go-based services that integrate with LLMs to automate and augment risk workflows.
Use AI-assisted development tools to accelerate prototyping, iteration, and implementation.
Design and implement prompt architectures for risk classification, control mapping, exception analysis, and policy interpretation.
Build structured evaluation frameworks to measure LLM quality, hallucination rates, determinism, and decision accuracy.
Implement automation for risk management workflows, including triage, remediation tracking, exception handling, and integrations with internal systems, to improve the scalability of the program.
Build evaluation loops to continuously improve prompt performance and model outputs.
Design schemas and structured data models for risk registers, control libraries, policy exceptions, and evidence artifacts.
Improve traceability between risks, controls, policies, and exceptions using intelligent automation.

Who You Are:

You have 2 years of experience in software engineering, building and operating production systems at scale, including deploying, managing, and troubleshooting services in Kubernetes environments.
You have hands-on experience with a modern programming language (ideally Go and Python).
You actively use AI-assisted coding tools (e.g., Cursor, Claude Code, Copilot) and are comfortable building systems.
Comfort experimenting with LLM APIs (OpenAI, Anthropic, etc.) and building AI-powered tools.
Experience working with APIs and distributed systems.
Demonstrated ability to independently break down complex problems, drive solutions, and execute with minimal supervision.
Strong written and verbal communication skills, with the ability to clearly articulate technical concepts through RFCs, design documents, and architectural diagrams.
Curiosity about AI systems and their limitations, including an understanding of failure modes such as hallucination, non-determinism, and prompt brittleness.
Solid foundation in software development best practices, including code quality, testing methodologies, and maintainable, scalable system design.
Self-motivated and able to take initiative in building programs that scale impact across the organization.