System Admin, Sr (Monitoring Nagios)
Job Summary
Job Description - L3 Monitoring Tools Engineer (Nagios & SCOM)
Location: Noida
Experience: 8-15 Years
Role Overview
We are looking for a Senior (L3) Monitoring Tools Engineer with strong expertise in Nagios and Microsoft SCOM to lead architecture design, implementation, optimization, SOW creation, and transition of enterprise monitoring environments.
The role requires deep technical expertise, solutioning capability, stakeholder interaction, and ownership of large-scale monitoring transformation programs.
Key Responsibilities
Monitoring Architecture & Design (Nagios & SCOM)
- Design enterprise-grade monitoring architecture for hybrid environments (Windows, Linux, Network, Applications).
- Lead installation, configuration, and upgrade of Nagios (Core/XI) and SCOM platforms.
- Define monitoring standards, naming conventions, alert taxonomy, and threshold frameworks.
- Design distributed monitoring setups, gateway architecture, and HA & DR strategy.
- Implement custom plugins (Nagios) and Management Pack customization (SCOM).
- Configure overrides, monitors, rules, distributed applications, dashboards, and reporting.
SOW Creation & Solutioning
- Lead pre-sales technical discussions for monitoring solutions.
- Define Scope of Work (SOW), effort estimation, timelines, and implementation roadmap.
- Perform infrastructure assessment and monitoring gap analysis.
- Create HLD/LLD for monitoring deployment.
- Define migration strategy from legacy tools to Nagios/SCOM.
- Provide cost optimization recommendations (license & infra sizing).
- Present solution architecture to client stakeholders.
Transition & Knowledge Transfer
- Lead end-to-end transition of monitoring environments from incumbent teams.
- Create transition plans including knowledge capture, risk register, and mitigation strategy.
- Define monitoring coverage matrix aligned with SLA and business criticality.
- Develop SOPs, runbooks, and operational playbooks.
- Conduct KT sessions and certify L1/L2 teams.
- Establish governance model for steady-state monitoring operations.
- Ensure zero disruption during monitoring tool migration or transition.
Alert Engineering & Optimization
- Perform threshold engineering and noise reduction initiatives.
- Design alert suppression, correlation, and dependency mapping.
- Integrate monitoring tools with ServiceNow (Event Management & Auto-ticketing).
- Reduce false positives and improve MTTR.
- Implement auto-resolution for repetitive alerts using scripting.
Automation & Integration
- Develop PowerShell automation scripts for bulk configuration & alert updates.
- Implement API-based integrations with ITSM platforms.
- Align monitoring with CMDB and service mapping.
- Drive continuous improvement through automation initiatives.
Incident & Governance Support
- Provide L3 support during P1/P2 incidents and perform monitoring validation.
- Participate in CAB, PIR, and audit discussions.
- Maintain compliance with ITIL & ISO 20000 frameworks.
- Provide monthly monitoring performance & improvement reports.
Technical Skills Required
Primary Tools
- Nagios (Core/XI) - Advanced configuration, plugin development, distributed monitoring.
- Microsoft SCOM - Management Packs, overrides, health model tuning, gateway setup.
Secondary / Supporting Skills
- SNMP, WMI, NRPE, NSClient
- Windows Server & Linux administration basics
- PowerShell scripting
- ServiceNow integration
- Monitoring of Network, Virtualization, DB, Middleware & Applications
Key Competencies
- Solution Architecture & Design
- SOW Drafting & Effort Estimation
- Client Communication & Stakeholder Management
- Transition & Transformation Leadership
- Monitoring Strategy Development
- Cost & License Optimization
- Documentation & Governance
KPIs for L3 Role
- Successful monitoring transition within defined timelines.
- 20-40% alert noise reduction through optimization.
- Zero major monitoring gaps post-transition.
- Improved MTTR through better alert quality.
- Successful delivery of SOW-based implementations within budget.
This JD Positions the Role As:
Monitoring Architect
L3 SME
Transition Lead
Pre-Sales & Solutioning Contributor
Platform Owner
Required Experience:
Senior IC
About Company
Created in 1987, Stefanini is a $1B global IT provider of business solutions with locations in 40 countries across the Americas, Europe, Australia and Asia. With more than 25,000 employees, Stefanini provides onshore, offshore and nearshore IT services, including application developme ...