Major Incident and Problem Manager, Associate
Job Summary
About this role
Team Overview
The Service Management team provides industry-standard Incident, Problem, and Change Management alongside infrastructure operational support for Aladdin. We operate using modern engineering practices and tooling, including ServiceNow and AI-enabled workflows, and measure outcomes through clear operational metrics.
Incident Management is responsible for restoring service during production incidents and driving scalable stability improvements across BlackRock and its Aladdin clients.
BlackRock operates a 24/7 Major Incident Management function supporting global clients across Europe, the Americas, Asia-Pacific, and India. This role is based in Edinburgh and covers core European hours between 09:00 and 18:00, Monday to Sunday, with rotational weekend working.
Role
We are seeking an experienced Incident & Problem Manager (5 years) with a strong passion for technical troubleshooting and the ability to lead multiple simultaneous incidents.
This role exists to deliver rapid time to detect and time to resolve, and to eliminate repeat incidents at a system level, by operating an AI-first incident delivery model. The Major Incident & Problem Manager is accountable for turning incidents into measurable stability improvements, particularly those caused by change, and for building an incident operating rhythm where AI handles correlation, classification, and narrative generation by default, allowing humans to focus on decision quality, trade-offs, and prevention.
In complex distributed platforms, incidents are often slowed by manual triage, fragmented ownership, and time-consuming coordination. This role addresses those challenges by creating a decision-centric incident response model powered by AI-driven signal correlation and automation-first execution, ensuring that:
The right responders are engaged faster
The most likely causes are identified sooner
Mitigation decisions are taken with clearer risk framing
Communications remain accurate and timely
Repeat failures are systematically removed rather than documented
The role partners closely with Engineering and SRE / DevOps teams, leveraging automation, observability tooling, and emerging AI-driven insights. The successful candidate will have a DevOps mindset, be able to actively troubleshoot, and utilise and enhance AI and automation.
The role also includes participation in continuous improvement initiatives aimed at improving the stability, performance, and resilience of the Aladdin platform and enhancing Service Management services.
Key Responsibilities
1. Lead major incidents as a decision authority (P1-P4)
Lead end-to-end management of production incidents, including investigation, recovery, execution, and closure
Run incidents as a decision system, driving clarity on what is known, what is suspected, and what action is taken next
Manage multiple simultaneous incidents while maintaining consistent prioritisation and escalation
2. Operate an AI-first incident workflow (human-validated, human-overridden when required)
Triage and categorise incidents using AI-driven classification, with human validation and override where appropriate
Drive AI-automated ticket routing and apply risk-based escalation judgement when automation is insufficient
Ensure incident timelines and summaries are produced to a high standard using AI-generated artefacts, correcting them where
3. Supervise automated remediation and agentic responders
Supervise automated remediation and agentic responders, intervening to pause, override, or redirect when risk requires
Ensure automated remediation is safe, auditable, and aligned with service ownership and operational readiness
4. Manage a robust Problem Management process to prevent incident recurrence
Ensure root causes and preventative actions are clearly captured and translated into an effective Problem Management process
Identify incident trends and repeat patterns, driving scalable remediation to reduce recurrence
Partner with Engineering and SRE / DevOps to embed learnings into automation, observability, runbooks, and readiness controls
Design, build, and actively maintain a Known Error Database that functions as a real-time operational asset
Work with product teams to design, build, and deliver a meaningful process for addressing repeat incidents
5. Deliver executive-grade communications (AI-drafted, human-approved)
Validate, approve, and issue regular communications that are concise, informative, and appropriate for stakeholders
Ensure communications accurately reflect impact, mitigation progress, key risks, and confidence-based ETAs
6. Drive continuous service improvement and regulatory alignment
Drive process and tooling changes that support operational resilience and regulatory requirements, including DORA and GDPR where applicable
Provide input and ownership for continual service improvement initiatives with a primary focus on Agentic AI and its application to Incident Management
Required Experience and Capabilities (Must Have)
5 years' experience in Incident and Problem Management within a production environment supporting business-critical platforms
Strong technical troubleshooting capability with the ability to engage credibly with engineers during complex failures
Proven ability to lead multiple simultaneous incidents and drive structured recovery under pressure
DevOps mindset with comfort using observability tooling, automation, and operational engineering practices
Ability to produce clear, high-quality communications suitable for senior stakeholders
Experience operating AI systems for triage, correlation, and narrative generation, with sound judgement on when outputs require validation or override
Ability to translate repetitive incident activity into automation requirements and drive adoption with engineering partners
Advantages / Desirable Qualities
Experience working in or with FinTech or regulated environments
Knowledge of cloud platforms such as Azure and/or AWS and understanding of IaaS / PaaS / SaaS service models
Experience with Microsoft Copilot and AI-enabled productivity tooling
Programming capability (e.g. Python) to automate common tasks or prototype improvements
Familiarity with configuration management, deployment, and orchestration tooling (e.g. Ansible)
Strong data analysis skills using tools such as Splunk, Grafana, Tableau, Excel, and/or Power BI
Strong experience with ServiceNow and operational reporting
Our benefits
To help you stay energized, engaged, and inspired, we offer a wide range of employee benefits including: retirement investment and tools designed to help you in building a sound financial future; access to education reimbursement; comprehensive resources to support your physical health and emotional well-being; family support programs; and Flexible Time Off (FTO) so you can relax, recharge, and be there for the people you care about.
Our hybrid work model
BlackRock's hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all. Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week. Some business groups may require more time in the office due to their roles and responsibilities. We remain focused on increasing the impactful moments that arise when we work together in person, aligned with our commitment to performance and innovation. As a new joiner, you can count on this hybrid model to accelerate your learning and onboarding experience here at BlackRock.
About BlackRock
At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being. Our clients, and the people they serve, are saving for retirement, paying for their children's educations, buying homes, and starting businesses. Their investments also help to strengthen the global economy: support businesses small and large; finance infrastructure projects that connect and power cities; and facilitate innovations that drive progress.
This mission would not be possible without our smartest investment, the one we make in our employees. It's why we're dedicated to creating an environment where our colleagues feel welcomed, valued, and supported, with networks, benefits, and development opportunities to help them thrive.
For additional information on BlackRock, please visit @blackrock on Twitter or BlackRock on LinkedIn.
BlackRock is proud to be an Equal Opportunity Employer. We evaluate qualified applicants without regard to age, disability, race, religion, sex, sexual orientation, and other protected characteristics at law.
Required Experience:
Manager
About Company
BlackRock is one of the world’s preeminent asset management firms and a premier provider of investment management.