The APM Features team builds intelligent troubleshooting experiences that help customers quickly understand and resolve performance issues in complex distributed systems. We work at the intersection of observability, product, and AI, transforming large volumes of noisy telemetry data into clear explanations, insights, and actionable conclusions.
This team is in an early, highly exploratory phase of applying LLMs and agentic workflows to real production APM problems. Engineers collaborate closely to prototype, test, and iterate on ideas, learning from experimentation and focusing on useful, reliable customer outcomes. Correctness, clarity, and product impact are central to how we work.
As one of the first AI Engineers in EMEA for this team, you'll help shape how AI engineering is practiced locally, working closely with peers to influence the future of AI-powered troubleshooting at Datadog.
At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.
What you'll do:
- Design and build AI-powered troubleshooting features for APM workflows using LLMs and agentic systems.
- Help users diagnose and resolve performance issues by synthesizing large volumes of observability data, including traces, metrics, and logs.
- Prototype, experiment, and iterate on AI-driven experiences, using evidence and user feedback to guide decisions and focus on real user value.
- Define inputs, outputs, and success criteria for LLM-based systems operating in evolving and sometimes ambiguous environments.
- Build agentic workflows with strong guardrails, balancing autonomy, safety, correctness, and reliability.
- Lead features end-to-end, in collaboration with peers and partners, from problem discovery through production and iteration.
- Design and maintain evaluation loops, including offline evaluations, benchmarks, and A/B tests.
- Write and own production backend services, contributing to reliable, scalable systems.
Who you are:
- A senior, product-minded engineer with experience shipping AI systems to production.
- Comfortable working in evolving problem spaces and proactively identifying meaningful opportunities to build.
- Hands-on experience with LLMs or agentic systems, including prompting, tooling, evaluation, and guardrails.
- Experience using AI coding tools such as Cursor, Claude Code, or similar, with the ability to reflect on what worked, what didn't, and why.
- A strong sense for correctness, failure modes, and how to measure and improve quality in AI systems.
- Comfortable experimenting, learning from outcomes, and iterating thoughtfully.
- Solid ML and applied science fundamentals, including experiment design and statistics.
Bonus points:
These are helpful but not required - we don't expect candidates to have experience with everything listed below.
- Exposure to agent frameworks, tool-use orchestration, retrieval-augmented generation (RAG), and indexing large-scale telemetry data.
- Familiarity with SLO/SLA practices and incident response.
- Hands-on experience with distributed tracing systems (OpenTelemetry, Datadog APM), profilers, or logs and metrics pipelines.
- Distributed systems fundamentals and familiarity with observability concepts.
Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you're passionate about technology and want to grow your skills, we encourage you to apply.
Benefits and Growth:
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, product training, and career pathing
- Intradepartmental mentor and buddy program for in-house networking
- An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)
- Access to Inclusion Talks, our internal panel discussions
- Free global mental health benefits for employees and dependents age 6
- Competitive global benefits
Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.
Required Experience:
Senior IC