Oracle is seeking an Incident Manager to oversee and enhance our incident management program for data center operations. The Incident Manager will be a critical leader responsible for ensuring adherence to the incident management program during incidents and for managing program documentation, reporting, cross-functional communication, Root Cause Analysis (RCA), and training. Candidates must have extensive experience in incident management program oversight and stakeholder collaboration, with a focus on maintaining operational continuity, safety, and compliance in high-stakes data center environments.
Responsibilities
Ensure strict adherence to the incident management program during incidents, guiding teams to follow established protocols for rapid response, mitigation, and resolution across data center operations (e.g., mechanical, electrical, and controls systems).
Oversee the development, maintenance, and continuous improvement of the incident management program, including policies, procedures, and documentation, to align with industry standards and Oracle's operational goals.
Support the creation and delivery of training programs to educate data center technicians, engineers, and operational staff on incident management protocols, ensuring preparedness and compliance.
Manage comprehensive incident reporting, utilizing metrics and Key Performance Indicators (KPIs) to track incident response times, resolution effectiveness, and program performance, driving operational excellence.
Conduct and facilitate post-incident Root Cause Analysis (RCA) processes to identify underlying issues, document findings, and implement corrective actions to prevent recurrence.
Collaborate cross-functionally with internal stakeholders (e.g., Data Center Facility Engineering (DCFE), operations, and vendor management teams) to ensure seamless communication during incidents and align incident management with broader operational objectives.
Develop and maintain incident management documentation, including playbooks, escalation procedures, and communication templates, to ensure clarity and consistency during high-pressure situations.
Escalate critical incidents, resource constraints, or program gaps to senior leadership for timely resolution, ensuring transparency and accountability.
Utilize data analytics to generate reports on incident trends, program effectiveness, and contributions to operational reliability, supporting Oracle's mission of excellence and teamwork.
Identify and implement process improvements in incident management, training, and RCA processes to enhance efficiency, safety, and performance in data center operations.
Minimum Qualifications
7 years of experience in incident management program oversight or operations leadership, with at least 3 years in data center operations, facilities management, or critical infrastructure environments.
Proven expertise in developing and managing incident management programs, including documentation, reporting, and training, in high-stakes environments.
Strong proficiency in conducting Root Cause Analysis (RCA) and implementing corrective actions to improve operational resilience.
Deep understanding of metrics-driven incident evaluation, cross-functional communication, and stakeholder collaboration in data center operations.
Ability to interpret technical specifications, operational protocols, and incident data to enhance incident management and training programs.
Willingness to travel to data center sites for incident response, program implementation, or stakeholder engagements.
Career Level - IC5
Required Experience: Manager
As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when eve ...