At Mollie, we're on a mission to make payments and money management effortless for every business in Europe. We started 20 years ago when we launched a more direct, affordable way for companies to get paid. That provided an alternative to the frustrating, overpriced solutions that banks offered at the time. Today we serve more than 250,000 businesses across Europe with an all-in-one solution that simplifies payments and money management. And we're an 850-strong team of product, finance, support, commerce, and engineering specialists working across Europe, from Lisbon to London.
Your Opportunity
As Mollie continues to scale, ensuring high availability and managing operational issues efficiently is critical. We are looking for an Incident Response Engineer to help monitor our systems, communicate effectively with relevant teams and stakeholders, and continuously improve our incident response procedures.
The Incident Response Team coordinates incident management efforts, ensures proper communication, and drives improvements to prevent operational disruptions. We collaborate with engineering, product, and support teams to triage incidents, escalate issues, and enforce response procedures.
We work in a fast-paced environment where technical curiosity, automation, and continuous improvement are essential. While we do not directly resolve incidents, we play a critical role in ensuring response efforts run smoothly, and we work on enhancing monitoring, reporting, and tooling to improve efficiency.
What You'll Be Doing:
Monitoring and Managing Incidents: Actively monitor for incidents, assess their impact using tools like Datadog and Sentry, respond appropriately, escalate to relevant teams, and coordinate internal and external communication through appropriate channels.
Ensuring Incident Response Processes Are Followed: Collaborate with stakeholders across multiple domains to ensure proper incident response procedures are followed. This includes documentation, internal and external communication, resolution tracking, and adherence to industry regulations (DORA, PCI DSS, ISAE).
Improving Incident Response Processes: Analyze past incidents, enhance monitoring coverage, and contribute to automation efforts to improve response efficiency. Participate in post-incident reviews to drive continuous improvements.
Enhancing Internal Tools & Automation: Support and improve internal incident management and reporting tools, such as Datadog and PagerDuty, to streamline workflows and optimize response processes.
Collaborating Across Teams: Serve as a bridge between technical and non-technical teams, ensuring effective coordination and clear communication during critical incidents.
What You'll Bring:
Scripting, Automation, and Source Control Experience: Proficiency in Python or other scripting languages to contribute to the automation efforts and tooling improvements mentioned above, along with familiarity with source control tooling such as Git.
Ability to Work Well Under Pressure: Comfortable making decisions and coordinating responses in high-pressure, time-sensitive situations.
Strong English Communication Skills: Ability to clearly communicate complex technical issues to both engineering teams and business stakeholders.
Analytical & Troubleshooting Abilities: Strong problem-solving skills to assess impact, escalate appropriately, and drive improvements in incident handling.
Self-Driven & Proactive Mindset: Ability to work independently, take initiative, and drive improvements in incident management workflows.
Stakeholder Management Skills: You have demonstrated the ability to collaborate effectively with multi-functional teams across all levels of the organisation.
Nice to Have:
Experience with Monitoring & Observability Tools: Familiarity with monitoring and logging tools, such as Datadog and Sentry, used for diagnosing system performance and availability issues.
Incident Response & Coordination Experience: Experience in a role involving incident response, technical troubleshooting, or managing operational issues.
API Experience: Hands-on experience working with APIs (Slack, Datadog, PagerDuty, Looker, Atlassian, and/or Google APIs), including debugging, monitoring, and integrating them into workflows.
Experience working in an on-call or incident response rotation.
Familiarity with cloud environments (e.g. GCP) and DevOps principles.
Knowledge of ITIL or incident management frameworks.
Full-Time