Position: Event Monitoring Engineer / Tech Lead
Location: Reston, VA #REMOTE
Duration: 12-month contract-to-hire
Must be flexible with monthly shift rotations
Job Description:
- This role supports the First-to-Know capability of the Technical Operations Center (TOC) and serves as the centralized focal point for observability and event management.
- Event Monitoring Engineers monitor the performance and capacity of enterprise-wide systems, applications, and critical business processes using a variety of tools to identify hardware, software, and environmental anomalies.
- The successful candidate will proactively look for ways to improve processes, ensure events are meaningful and actionable, look for inefficiencies, and document new processes as they evolve. Proficiency in scripting and coding would be a great benefit to this team.
- This role requires shift work. The client's Technology Operations Center covers multiple types of shifts, including weekdays, weekends, and eventually a 24/7 operation.
- Team members rotate work times to cover all processes and are asked to be flexible in providing coverage outside their normal shift hours when the need arises.
- This is a contract position and can be performed fully remotely.
Required Qualifications:
- Associate of Arts/Associate of Science degree and 3 years of experience, or an equivalent combination such as a bachelor's degree and 2 years of experience, or no degree and at least 3 years in a NOC/TOC command center role
- 3 years of IT experience and an understanding of performance monitoring tools
- 3 years of Dynatrace monitoring experience
- 2 years operating in a command center in an Incident Management or Event Monitoring/Event Management role
- 3 years of experience working with Splunk, SCOM, SolarWinds, or other performance monitoring tools
- Required: experience with scripting languages such as PowerShell, Ruby, or Perl
- Ability to assess monitoring events and respond or escalate accordingly
- Knowledge of and experience with system and network infrastructures, such as LAN and WAN network technologies, server virtualization, enterprise storage area networks (SAN), and backup and database technologies
- Strong analytical skills and the ability to collate and interpret data from various sources
- Strong communicator, both verbal and written, with a natural aptitude for collaboration
- Process engineering or process management experience
- Experience working in a ServiceNow environment
- Experience with Jira and project management frameworks such as Agile and Scrum
- Experience reporting against and managing Service Level Agreements (SLAs)
Responsibilities include:
- Provide eyes-on-glass monitoring using Dynatrace and other monitoring tools
- Support a 24x7 system monitoring service to proactively identify and assess problems
- Provide oversight, coordination, and visibility for critical business processes
- Perform system health checks, some manual and some automated
- Identify, investigate, verify, report, communicate, and escalate critical events
- Review device logs, documentation, and analysis where applicable
- Develop runbooks and manage documentation for repeatable processes (Lifecycle Management)
- Follow basic triage steps, monitor production systems, and ensure their high availability
- Facilitate and coordinate the necessary IT response to system problems
- Continuously analyze events and eliminate noise and non-actionable event trends (Continual Service Improvement)
- Provide event management support to service owners and IT managers
- Author reports on trends and anomalies for Key Performance Indicators (KPIs) for Event Management and Monitoring
- Communicate to stakeholders; support and facilitate open communication between all stakeholders.
Thanks & Regards
--
LAXMAN
KMM Technologies Inc.
CMMI Level 2, ISO 9001, ISO 20000, ISO 27000 Certified