Senior Software Engineer - AI App Enablement & Observability

Bloomberg


Job Location:

Dublin - Ireland

Monthly Salary: Not Disclosed
Posted on: 10 hours ago
Vacancies: 1 Vacancy

Job Summary

Senior Software Engineer - AI App Enablement & Observability
Location: Dublin
Business Area: Engineering and CTO
Ref #

Description & Requirements

Platform Engineering builds the core platforms, tooling, and paved roads that Bloomberg engineers rely on to ship reliable, secure, and high-performing systems at scale.

The AI App Enablement & Observability team accelerates how AI products are built across Bloomberg Industry Group. Our mission is to make AI systems reliable, performant, cost-efficient, and continuously improving through platform tooling, deep observability, and automated feedback loops.

We build developer-facing platforms and workflows that enable teams to experiment with, deploy, and operate AI and agent-based systems with confidence. This includes LLM gateways, agent platforms, benchmarking systems, telemetry pipelines, and self-improving infrastructure that closes the loop between observability and action. We emphasise strong developer experience, intuitive APIs/SDKs, and end-to-end ownership.

What's in it for you
You will help define how Bloomberg Industry Group builds and operates AI systems at scale by working on platforms that:

  • Accelerate AI product development through reusable tooling and paved roads
  • Provide end-to-end observability across AI systems (models, agents, pipelines, applications)
  • Enable self-improving systems through telemetry-driven feedback loops
  • Optimise cost, performance, and reliability of AI workloads
  • Support both production AI systems and internal engineering agents

You'll collaborate across AI, product, infrastructure, and platform teams to deliver foundational systems.

We'll trust you to:
Platform & Enablement
  • Build and evolve AI platform tooling (e.g. developer workflows, benchmarking systems)
  • Design developer-friendly APIs, SDKs, and interfaces
  • Contribute to systems across the Model Development Lifecycle (experimentation, deployment, evaluation)

Observability & Telemetry
  • Build and operate observability platforms and telemetry pipelines (logs, metrics, traces, events)
  • Provide visibility into latency, token usage, cost, quality, drift, and reliability
  • Define instrumentation standards, schemas, and conventions
  • Implement distributed tracing using modern approaches (e.g. OpenTelemetry)

AI System Insights & Debugging
  • Enable end-to-end debugging of AI and agent workflows (model calls, tool usage, retrieval, orchestration)
  • Build benchmarking, regression detection, and performance analysis capabilities
  • Support observability for both production systems and internal engineering agents

Closed-loop Optimization & Automation
  • Develop systems that turn telemetry into action (automated experimentation, regression detection, alerting)
  • Build feedback loops that continuously improve model quality and system behavior
  • Enable self-healing and self-optimising workflows

Cost, Performance & Reliability
  • Build tooling for cost visibility, forecasting, and optimization
  • Define SLOs, alerting, and performance tuning practices
  • Improve reliability and scalability of AI infrastructure

Ownership & Collaboration
  • Own projects end-to-end (RFCs, architecture, implementation, rollout, production support)
  • Partner with AI teams to drive adoption of platform tooling and standards
  • Produce high-quality documentation and improve developer experience

You'll need to have:
  • Demonstrated experience building production software or platform systems
  • Strong engineering fundamentals with distributed systems or backend platforms
  • Experience or strong interest in observability and debugging complex systems
  • Experience or strong interest in AI/ML systems, LLMs, or agent-based architectures
  • Strong ownership mindset and ability to drive ambiguous problems to production
  • Hands-on experience with modern agentic coding tools and multi-model workflows
  • Working knowledge of agent architecture internals (context engineering, tool loops, sub-agent orchestration)

We'd love to see:

  • Experience with OpenTelemetry and modern observability ecosystems, including instrumentation, collectors, exporters, and tools like Prometheus, Grafana, and tracing/log systems
  • Experience designing and operating telemetry pipelines, including sampling, retention, cardinality, and cost tradeoffs, as well as integrating observability into CI/CD and developer workflows
  • Familiarity with AI/agent frameworks, including instrumentation of LLM calls, tool usage, workflows, and evaluation signals (quality metrics, benchmarking, regression detection)
  • Experience building cost monitoring, forecasting, and optimization systems for AI workloads
  • Familiarity with cloud and infrastructure tooling (e.g. AWS, Azure, Kubernetes, Terraform)
  • Experience with agentic infrastructure concepts such as MCP servers, hooks, skills, subagents, sandboxing, and persistent memory patterns
  • Active engagement with the agentic engineering frontier, including emerging patterns (e.g. harness vs. model, review debt, feedback loops)
  • Demonstrated agent-native development practices (iterating with agents using testing, verification, and feedback loops)
  • Strong security awareness for autonomous systems, including sandboxing, prompt injection risks, credential exposure, and guardrails

If indicated, please note that years of experience are a guide; we will consider applications from all candidates who can demonstrate the skills necessary for the role.

Required Experience:

Senior IC

About Company

Bloomberg is the world's primary distributor of financial data and a top news provider of the 21st century. A global information and technology company, we use our dynamic network of data, ideas and analysis to solve difficult problems every day. Our customers around the world rely on ...
