Job Summary
The Sr. Observability Engineer is responsible for designing, deploying, and optimizing the client's enterprise observability ecosystem. This role delivers hands-on implementation and consulting expertise, focusing on LogicMonitor and modern observability practices to drive actionable insights, predictive analytics, and operational excellence across infrastructure, network, and application layers.
Key Responsibilities
- Deploy, configure, and optimize LogicMonitor for enterprise-scale observability.
- Design and build custom dashboards for actionable insights and performance monitoring.
- Implement and manage data analytics workflows, including advanced scripting in Python for automation and reporting.
- Integrate and manage data pipelines leveraging Kafka and related streaming technologies.
- Ensure seamless data flow into Grafana for visualization and monitoring.
- Develop and maintain integrations between observability, ITSM (ServiceNow), and event management tools (PagerDuty, Slack, BigPanda).
- Standardize alert thresholds, escalation paths, and telemetry mappings across global regions.
- Define and maintain event, alert, and rule logic to ensure accurate correlation and minimal noise.
- Manage data ingestion pipelines from SNMP, syslog, APIs, and third-party sources into LogicMonitor and downstream analytics systems.
- Advise on and implement AI-driven observability tools, including Amazon Bedrock, to enhance predictive analytics and anomaly detection.
- Partner with network, server, and application teams to validate data flows, performance metrics, and dependency mapping.
- Automate configuration and onboarding processes via API and scripting (Python, PowerShell, REST).
- Support incident and problem management teams by correlating events across multiple tools to accelerate root cause analysis.
- Document integrations, processes, and governance models for sustained operational excellence.
- Serve as technical SME supporting observability tool upgrades, testing, and cross-platform enhancements.
- Collaborate with stakeholders to align monitoring strategies with business objectives.
Core Expertise & Skills
- LogicMonitor platform deployment, configuration, and optimization.
- Deep understanding of observability frameworks, best practices, and enterprise monitoring.
- Python scripting for data analytics, automation, and advanced reporting.
- Kafka-based data streaming and integration.
- Grafana dashboarding and visualization.
- Experience with AI platforms and emerging observability technologies, including Amazon Bedrock.
- Familiarity with ITSM and event management systems (ServiceNow, PagerDuty, BigPanda).
- Telemetry protocols (SNMP, syslog, NetFlow, APIs) and data flow architecture.
- Strong understanding of alerting, correlation logic, and performance baselining.
- Excellent analytical, documentation, and communication skills; ability to work across engineering and operations teams.
- 5-10 years of experience in network monitoring, observability, or infrastructure engineering roles.
- Hands-on experience with LogicMonitor, Splunk, ThousandEyes, Datadog, Dynatrace, Cisco DNAC, and related platforms.
- Scripting and automation experience (Python, PowerShell, REST APIs).
- Looking for an expert in the LogicMonitor platform.
- The customer team is decommissioning 4 platforms and consolidating them into LogicMonitor.
Location: On-site 3 days/week; Orlando, FL preferred
Duration: 12-month extendable contract
Required Qualifications
Required Skills:
Sr. Network Automation Engineer - Hybrid, on site 2 days/week, Santa Clara, CA
About the Role
We're seeking a hands-on Infrastructure Systems Developer who thrives in building full-stack systems with a focus on network automation. This is not your traditional network engineering role; we're looking for someone who comes from the DevOps or systems development world and has ventured into networking by building tools, platforms, and automation frameworks that interact with network infrastructure. You will own the architecture and development of a full-stack system that ingests, stores, and acts on network telemetry and configuration data, from backend frameworks to frontend UI to device interaction and automation.
Key Responsibilities
- Design and architect an end-to-end automation system for network configuration and telemetry.
- Choose and implement the right technologies:
  - Database: SQL, NoSQL, or MDM SQL solutions.
  - Backend: Python (FastAPI, Flask), Go, or similar.
  - Frontend: React, Vue, or modern JS framework.
- Build integrations with CI/CD pipelines (e.g., Jenkins).
- Implement configuration management and telemetry collection using Ansible, SaltStack, or similar tools.
- Create APIs and services to interface with network devices.
- Ensure scalable data storage and retrieval for network metadata and telemetry.
- Collaborate with network engineers, DevOps, and security teams.
Requirements
- Strong programming skills in Python, Go, or equivalent languages.
- Experience designing and building production-level infrastructure systems.
- Deep understanding of system architecture and software lifecycle.
- Familiarity with network automation concepts and tools, even if not a traditional network engineer.
- Hands-on experience with:
  - CI/CD: Jenkins or equivalent.
  - Config Management: Ansible, Salt.
  - Database systems: SQL and NoSQL (MongoDB, PostgreSQL, etc.).
  - Frontend frameworks: React, Vue, or similar.
- Comfort working across the full stack and owning the entire lifecycle of a system.
Nice to Have
- Exposure to network protocols and device-level APIs (e.g., NETCONF, RESTCONF).
- Experience in telemetry collection, parsing, and visualization.
- Contributions to open-source DevOps or automation tools.
- Experience with MDM/metadata modeling.
Required Education:
Master's preferred