Observability Engineer
New York City, NY - USA
Job Summary
Neuberger's Technology team is seeking an Observability Engineer to lead and evolve our observability strategy across cloud and on-premises environments. You will help build and operate a server monitoring platform that continuously validates service health (24/7) across business-critical systems, including external websites and key infrastructure components (e.g., firewalls, OpenShift). You will design and implement end-to-end monitoring solutions spanning logs, metrics, traces, Service Level Objectives (SLOs), synthetic monitoring, and RUM (Real User Monitoring) to improve reliability, accelerate incident response, and deliver clear visibility into service performance.
This is an individual contributor role with strong engineering/scripting expectations (not a pure administrator role, though admin experience is helpful). You will partner closely with application, SRE/DevOps, infrastructure, and security teams and act as a champion/evangelist for observability tooling and standards. The environment includes a current OpenView footprint with a migration to Datadog, with workflows integrating into ServiceNow for incident/ticket routing and escalation.
What you'll do:
Partner closely with application, DevOps, engineering, SRE/operations, infrastructure, and security teams to understand reliability goals and translate them into scalable monitoring/observability solutions across cloud and on-prem environments (Windows and Unix).
Design, build, and maintain scalable observability architectures and platforms, with ownership of monitoring capabilities for key applications and services (application ownership).
Develop automated processes to continuously scan and validate uptime/health (24/7) for business-critical services, including external-facing websites and supporting infrastructure.
Implement and optimize telemetry collection, alerting, dashboards, and service views; drive adoption of OpenTelemetry (OTel) and consistent logging/metrics/tracing standards (core logging and platform telemetry alignment).
Define and operationalize SLOs, and implement actionable alerting strategies that reduce noise and improve MTTR through correlation, enrichment, and threshold tuning.
Implement and evolve APM capabilities and user experience monitoring, including RUM (Real User Monitoring) and synthetic monitoring approaches.
Integrate observability tooling with incident/problem management processes and ITSM workflows (e.g., Datadog to ServiceNow); support ticket routing/escalation and produce runbooks, post-incident reviews, and executive/operational reporting.
Automate onboarding and configuration for telemetry, dashboards, monitors, and alerts using scripting and infrastructure-as-code; ensure consistency and repeatability across Windows Server and Unix (Linux/Solaris).
Collaborate on platform evolution and cost/scale optimization, continually improving coverage, data quality, developer experience, and overall reliability outcomes.
Champion and evangelize observability practices and tooling adoption across technology teams, helping incorporate new applications/tools into the monitoring platform.
Required Skills and Experience:
BS/BA in Computer Science, Information Systems, Engineering, or equivalent experience.
5 years of experience in Observability/APM/SRE/Platform Engineering, with a track record of delivering production-grade telemetry and reliability outcomes.
Proficiency operating in both Windows Server and Unix (Linux/Solaris) environments, including service instrumentation, agent/collector deployment, and OS-specific performance analysis.
Strong experience designing and operating distributed tracing, metrics, and logging standards, SLOs/error budgets, and actionable alerting using modern observability practices.
Hands-on experience with cloud monitoring across Azure and AWS, integrating platform telemetry into centralized observability solutions.
Hands-on experience with Observability/APM suites (OpenView, AppDynamics, Datadog) and network management tools (Network Node Manager, Network Automation, NetProfiler).
Scripting and automation expertise (e.g., Python, PowerShell, Bash) and familiarity with APIs/SDKs; experience using infrastructure-as-code to manage observability configurations (e.g., Terraform) and configuration formats (e.g., YAML).
Demonstrated ability to reduce alert noise and MTTR through correlation, enrichment, and threshold tuning; experience producing service maps, dependency views, and clear dashboards.
Excellent communication and stakeholder management skills, with the ability to explain technical concepts to non-technical audiences.
Ability to work independently and collaboratively in a fast-paced environment; strong documentation habits and attention to detail.
Nice to Have
Development experience (C#), including instrumentation patterns for observability applications.
Experience in financial services or other regulated industries.
Familiarity with ITSM integrations and CMDB alignment for incident, problem, and change processes.
Exposure to APM and monitoring suites and event correlation approaches; knowledge of network monitoring concepts.
Experience with CI/CD integration, synthetic testing strategies, and performance/capacity analysis for latency-sensitive systems.
Relevant certifications in observability, cloud monitoring, or related platforms.
This is a hybrid position. Currently, the hybrid work schedule for this position is 2-3 days in the office. Please understand that the hybrid schedule may be modified or eliminated at any time at Neuberger's discretion.
Neuberger is unable to offer visa sponsorship for this position. Applicants must be authorized to work in the United States without the need for current or future sponsorship.
Engineer II
Compensation Details
The salary range for this role is $110,000-$140,000. This is the lowest to highest salary we in good faith believe we would pay for this role at the time of this posting. We may ultimately pay more or less than the posted range, and the range may be modified in the future. This range is only applicable for jobs to be performed in the job posting location. An employee's pay position within the salary range will be based on several factors, including but not limited to relevant education, qualifications, certifications, experience, skills, seniority, geographic location, business sector, performance, shift, travel requirements, sales or revenue-based metrics, market benchmarking data, any collective bargaining agreements, and business or organizational needs. This job is also eligible for a discretionary bonus which, along with base salary and retirement contributions, is part of our total comprehensive package. We offer a comprehensive package of benefits, including paid time off, medical/dental/vision insurance, retirement, life insurance, and other benefits to eligible employees. Note: No amount of pay is considered to be wages or compensation until such amount is earned, vested, and determinable. The amount and availability of any bonus, commission, production, or any other form of compensation that are allocable to a particular employee remains in the Company's sole discretion unless and until paid and may be modified at the Company's sole discretion, consistent with the law.
Neuberger is an equal opportunity employer. The Firm and its affiliates do not discriminate in employment because of race, creed, national origin, religion, age, color, sex, marital status, sexual orientation, gender identity, disability, citizenship status, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact .
Learn about the Applicant Privacy Notice.