Site Reliability Engineer ITIL Incident Management
Elite Mente LLC

Job Location

USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Req ID: 2838753

Position: Site Reliability Engineer (ITIL) Incident Management

Location: 100% Remote (CDT Hours)

Duration: 6 Months Contract to Hire

USC/GC only

Interview: Video

Role Overview:

Our IT Service Management (ITSM) team is composed of highly skilled Incident, Problem, Change, and Event Management professionals. We are seeking a dynamic and motivated individual to join our team. As an IT Incident and Data Analyst, you will be responsible for managing and improving the delivery of the Client's internal IT Service Management services, helping identify and address potential system issues, and ensuring our systems are set up to properly alert our technical teams. You will work closely with our technical teams to ensure that our systems are monitoring and alerting efficiently and effectively.

You will be responsible for managing the process that defines the end-to-end lifecycle of incidents, events, problems, and changes within the organization to ensure effective resolution and prevention of future occurrences. The ideal candidate will have a strong background in IT service management (incident, problem, change, and event management), with the ability to lead and coordinate cross-functional teams to drive continuous improvement within the organization.

Key Areas of Responsibility:

  • This is a strategic and hands-on position where you will work closely with cross-functional teams to develop and optimize Service Management processes (Incident, Problem, Change, and Event management), drive continuous improvement, and enhance our proactive capabilities.
  • Monitor system management consoles and respond to alerts.
  • Facilitate Major Incident conference calls independently, performing multiple roles including Situation Leader, Scribe, and Communications to executive leadership.
  • Lead and coordinate the end-to-end incident management process, from detection and diagnosis to resolution and post-incident analysis, including RCAs, to ensure correct monitoring and automated alerting are in place to prevent any repetition.
  • Help improve problem tracking, root cause analysis, and availability of products across Technology.
  • Proactively detect and prevent future problems and incidents, and initiate the Problem Management process to allow quicker diagnosis and resolution.
  • Conduct weekly Change Advisory Board calls and related meetings, and drive tooling automation (requirements, testing, adoption) to support Change Management Operations.
  • Develop trend analysis and prepare service improvement plans to address identified gaps.
  • Implement and enforce OLAs/SLAs to ensure effective governance of change requests through the Change Management lifecycle.
  • Define and inspect metrics, KPIs, and trend reports for use in the problem management process.
  • Build strong relationships with key stakeholders, including senior management, department heads, and external partners, to ensure their support and engagement in incident management initiatives.
  • Foster a culture of continuous improvement, staying abreast of industry trends, emerging technologies, and best practices to enhance incident management capabilities.
  • Create dashboards and reports to provide insights into operational performance and health.
  • Build automation to optimize processes and workflows within our on-call systems and monitoring platforms.
  • Complete any assigned project work or tasks with a view to improving existing processes and capabilities, and seek out automation opportunities.
  • Collaborate with engineering teams to ensure that incident learnings are integrated into the software development lifecycle to improve overall system resilience.
  • Ability to support the on-call rotation and provide off-hours support as required.

Qualifications:

Minimum Qualifications:

  • Bachelor's degree in IT, Business Management, or a related discipline preferred.
  • 5 years of direct experience in Change, Incident, and Problem management methodologies and processes.
  • 3 years of technical experience: systems engineering, SRE, DevOps, software engineering.
  • 4 years of direct experience with ITSM tools (ServiceNow a plus).
  • 4 years of direct experience with CMDB tools (ServiceNow a plus).

Other Required Qualifications:

  • Excellent written and verbal communication skills, with the ability to communicate effectively with all stakeholders, including senior leadership.
  • Strong ability to understand, accurately translate, and produce technical information for a general and business audience.
  • Strong experience with change, incident, and problem management principles, methodologies, and tools.
  • Experience using configuration and change tools such as ServiceNow Change and CMDB, or related tools.
  • Experience with project delivery methodologies (Agile, Scrum).
  • Hands-on experience with monitoring and performance monitoring tools: Datadog, Dynatrace, Splunk, etc.

Preferred Qualifications:

  • ITIL v3 Foundation Certification preferred.
  • Certification in Project Management.
  • Experience implementing continuous process improvements within a configuration, change, release, or asset management program.
  • Cloud certifications (Azure, AWS, GCP).
  • Direct experience scripting in two of the following languages: Python, PowerShell, Bash.
  • Proficient at technical and business writing.

Employment Type

Remote

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting
