Senior Specialist Systems Administration
Job Summary
Company: Mercer
Description:
We are seeking a talented individual to join our team at Mercer. This role will be based in Toronto. This is a hybrid role that has a requirement of working at least three days a week in the office.
The Observability Engineer is a seasoned professional who leverages telemetry and observability to drive site-reliability improvements, optimize instrumentation, and deliver consolidated alerting and service-level reporting. You will manage and tune enterprise monitoring toolsets, design integrations and automation for remediation and self-service, apply ITIL practices within an agile model, lead implementations, mentor junior staff, and align monitoring activities with business needs.
We will count on you to:
- Use telemetry and observability data to drive site-reliability improvements, optimize instrumentation, and support consolidated alerting and service-level status across platforms.
- Configure, maintain, and tune enterprise monitoring and observability toolsets; design and implement integrations for unified alerting and status reporting.
- Partner regularly with application, platform, and stakeholder teams to ensure deployed instrumentation is effective and to optimize telemetry dashboards and reports for actionable insights.
- Identify and implement automation opportunities (self-service, simplified administration, automated remediation, and deployments) to improve operational efficiency.
- Apply ITIL practices (Event/Availability, Service Level, Problem, Change, and Configuration Management) and participate actively in agile planning and team collaboration.
- Lead complex implementations and projects, mentor junior staff, and communicate project status, risks, and issues clearly to technical and business audiences.
- Maintain a strong customer focus: manage expectations, build user trust, and align monitoring and reliability activities with business objectives.
What you need to have:
- Minimum 10 years of IT experience, ideally within mid- to large-sized global organizations.
- Deep understanding of application service composition and cross-technology dependencies (application services, databases, and infrastructure).
- At least 5 years of hands-on experience with system and application monitoring, APM, synthetics, and real-user monitoring using technologies such as OpenTelemetry, Prometheus, Datadog, Splunk, LogicMonitor, BigPanda, ServiceNow ITOM, or comparable tools (Dynatrace, New Relic, Elastic).
- Proven experience using ServiceNow (Incident, Configuration/CMDB, Problem, and Change modules) to manage work, incidents, and configuration items.
- Comfortable working in agile teams and using tools like JIRA or Azure DevOps for backlog management, sprint planning, and tracking.
What makes you stand out:
- Hands-on experience with ServiceNow (Incident, Configuration, Problem, Change) to manage and update assigned work, coupled with demonstrated application of core project management approaches to plan, track, and deliver tasks.
- Proficient with Datadog platform administration and monitoring capabilities, and experienced deploying and configuring OpenTelemetry, including planning, optimization, and migrations from other toolsets.
- Instrumentation expertise to align user experience with service optimization and telemetry, and the ability to work independently while providing clear, timely updates to peers and stakeholders.
Why join our team:
- We help you be your best through professional development opportunities, interesting work, and supportive leaders.
- We foster a vibrant and inclusive culture where you can work with talented colleagues to create new solutions and have an impact on colleagues, clients, and communities.
- Our scale enables us to provide a range of career opportunities as well as benefits and rewards to enhance your well-being.
Required Experience:
Senior IC