Senior Observability Engineer (Production Resilience)
Job Summary
Role: Senior Observability Engineer
The Senior Observability Engineer (Production Resilience) is a recent role in the organization, in line with the evolution of the IT Production mission and the Master Control Room concept. The role is in permanent dialogue with the Production monitoring teams, the System Engineering Monitoring team, the Application owners, the Component owners and Business representatives to improve monitoring/observability and to identify and eliminate Production risk areas and blind spots.
This is not a purely operational or monitoring role; it requires the ability to challenge existing setups, drive improvements and influence multiple teams. You should be comfortable operating at system level rather than component level, understanding end-to-end flows and risks across services.
More specifically:
- oversees the maintenance and development of observability scenarios owned by Production.
- helps to identify blind spots and prompts component/application owners to eliminate them through additional data feeds or queries.
- supports projects and component/application owners in assessing and meeting the requirements for monitoring/observability solutions and their implementation.
- supports Business in their needs to implement Service monitoring.
- masters the incident management process.
- has proven senior expertise in using observability products, especially Splunk.
The Senior Observability Engineer (Production Resilience) function belongs to the following generic System Engineer function in our internal IT functions classification:
Division: Group Technology Services (GTS)
The core of IT System Management lies in the scanning, evaluating, selecting, adopting and decommissioning of core technologies for the Euroclear IT infrastructure environments. This requires the function holder not only to maintain close relationships with the key technology providers but also to develop a consistent technology strategy and life-cycle management plans, backed up by a forward-looking hardware and software budget (capital expenditure as well as operating expenditure). Typical activities within the IT System Management job description are:
- Designing, developing, implementing and supporting IT infrastructure solutions to support business opportunities in alignment with the enterprise architecture direction and standards
- Performing technical planning to select and implement appropriate technology for supporting business solutions at an affordable cost
- Providing an adequate level of expertise to support and troubleshoot infrastructure issues
- Assessing the compatibility and integration of products/services proposed as standards in order to ensure an integrated architecture across interdependent technologies.
Role and Responsibilities:
- Identify best practices and client performance gaps, if any
- Use tools to identify, align and change the factors that affect performance, stability and teaming
- Understand and assess a client's business operations, identifying issues and opportunities and recommending appropriate solutions
- Ensure proper translation of business requirements into knowledge, technology and organization specifications that satisfy the client's strategic and operational objectives
- Develop business cases and perform impact analysis, engaging a wide range of stakeholders within business and IT
- Work with business system colleagues to analyze the high-level information requirements and information flows required to support better processes, and support business system colleagues through the development process
- Ensure appropriate security and governance controls are in place with regular recertification
- Have specific expertise and lead knowledge development within a specific discipline or industry
- Engage with stakeholders at the highest IT level
- Recommend acquisition of new hardware, software and/or telecommunications systems
- May perform team lead activities for the domain and ensure the overall development of team members
- Focus on relationship management, innovation and the development of business cases which have a large impact on business processes; ensure realisation of business cases after approval
- Give advice on tactical matters
- Translate strategy and policies into improved/changed processes, (new) guidelines, standards and models
- Build a strategic view on one or several related dimensions (people, products & services, cost)
- Ensure that systems are properly operated and maintained and that system evolution is successfully implemented
- Identify strategic capabilities needed to create and sustain one or more sources of competitive advantage. These capabilities may come in the form of process, information technology, organization enablement or knowledge
Required Qualifications:
- 5-8 years of relevant experience in at least one of the following domains: Production Engineering, Site Reliability Engineering (SRE), Infrastructure/Platform Engineering, Monitoring/Observability Engineering, or Incident/Problem Management in complex enterprise environments
- Proven hands-on experience with observability platforms, with strong expertise in Splunk (data onboarding, SPL development, dashboarding, alerting logic, use-case design)
- Complexity & problem solving: Handle standard and non-standard situations with a low level of uncertainty, covering multiple and new domains of expertise
- Decision making: Use sound judgment to take decisions within the operational domain. Foster the development of new/improved procedures and methods in their own operational domain. Take decisions, or support the decision process, on matters of significant importance in their own operational domain.
- Strategic approach and impact: Master the delivery process and continuously look for improvements to evolve their own domain of responsibility. Drive the medium-term evolution of their own domain and consider how actions and decisions can have an impact over the longer term. Lead the change process at the implementation level within their own domain. Support and participate in change within the broader framework.
- Stakeholder management: Build a network of internal relationships across the organisation to deliver across domains. Use persuasion and influencing skills to get the required support. Upon request, deal with external stakeholders or regulators on well-defined tactical or technical matters. Autonomously deal with tensions and disputes, looking for consensus and benefit for the company
- Autonomy and leadership: Work autonomously on a large domain. Work within clearly defined (periodic) targets with full responsibility for delivery. Provide regular feedback to management on objectives and achievements. Have full autonomy over the decisions to be taken about a set of products / services / solutions / applications under their own responsibility. Manage a medium-sized team/group in the operational delivery of a sub-part of the business
Please note that this is a permanent position and we do not offer freelance/contract arrangements for this role.
Required Experience:
Senior IC
About Company
Why join us? Embark on your new adventure at Euroclear, and work at the heart of the global capital markets. We connect over 2,000 financial institutions across the globe. As an open and resilient infrastructure, we contribute to the stability of the financial markets. We help clients ...