Role: Incident Manager
Location: Oakland, CA (Hybrid, 3 days onsite)
Responsibilities
- Manage incident management bridge calls with support teams, on-call support, application teams, and management. Manage, escalate, report status on, and assist in coordinating repair efforts for all major incidents (P1 P4).
- Provide regular communication updates to customers, end users, and other stakeholders throughout the incident management cycle.
- Track and document incident updates in real time
- Handle major incidents, which are highly escalated cases, with presence of mind and innovation.
- Support the development and execution of change management plans to drive adoption and utilization of new processes, systems, and technologies.
- Review changes, their priority, and their urgency, and perform risk analysis.
- Create problem tickets and their respective action items; review root cause analyses and their closure.
- Prepare PIR and postmortem reports.
- Lead Site Reliability/Disaster Recovery/Game Day/Switchover/Failover activities.
- Experience handling multiple monitoring tools such as ServiceNow, PagerDuty, Slack, Zoom, JIRA, etc.
- Perform quality audits and data analytics on incident tickets to ensure quality and uncover new trends.
- Meet the agreed SLAs and other KPIs, and produce process performance reports.
- Provide documentation for the Known Error Database (KEDB) or a similar repository.
- Develop processes and procedures that ensure incident management action items are tracked and completed.
- Ensure process adherence and meet quality norms.
- Provide Management reporting on Incident Metrics and Incident Management performance
Qualifications/Skills Required
- Degree in Computer Science, Information Technology, or a related field.
- 7-10 years of experience in incident management or a related field.
- Knowledge of cloud services (AWS/Azure/GCP) is a must.
- Advanced proficiency in site reliability culture and principles, with the ability to demonstrate how to implement site reliability across platform teams while avoiding common pitfalls.
- Should be able to plan and conduct site reliability testing
- Should have experience in Application Management Services (AMS).
- Knowledge of incident management/change management/problem management processes and procedures.
- Experience with and knowledge of change management principles, methodologies, and tools.
- Excellent problem-solving and analytical skills.
- Excellent verbal & written communication and interpersonal skills.
- Ability to work independently and as part of a team.
- Ability to manage multiple tasks simultaneously.
Note: This is NOT an infrastructure support role. This is a semi-technical role to support an environment that is 100% hosted in the cloud and to drive application-related issues.
Warm Regards
Ankita Katailiha
Senior Technical Recruiter