Life at MX
We are driven by our moral imperative to advance mankind, and it all starts with our people, product, and purpose. We always carry a deep sense of drive and passion with us. If you thrive in a challenging work environment surrounded by incredible team members who will help you grow, MX is the right place for you.
Come build with us and be part of an award-winning company that's helping create meaningful and lasting change in the financial industry.
Mission Alignment: Drive MX reliability via prevention engineering, AI automation, and SRE excellence.
Key Responsibilities
Incident Response (Hands-On): Triage and mitigate production incidents using observability for the Golden Signals (latency, traffic, errors, saturation). Identify root issues in MX architecture (e.g., database sharding, webhook failures), escalating just-in-time under unified command to keep MTTR under 1 hour.
Incident Response: Drive a coordinated response to an active incident, ensuring that communication is clear, roles are assigned, and the incident is mitigated efficiently and safely.
Analysis & Pattern Recognition: Analyze incidents, postmortems, and architecture (e.g., RMQ overflows) to identify recurring issues. Develop knowledge bases and AI-driven early warning systems for proactive prevention.
Prevention Engineering: Identify incident trends and drive cross-org projects across multiple engineering teams to prevent repeat incidents. Identify and implement mitigation mechanisms; improve observability, alerting, and runbooks.
Perform on-call rotations; enforce the priority matrix; lead RCAs with client-readable, legal-approved outputs.
Collaborate cross-functionally to enhance observability (e.g., black-box probes) and reduce technical debt.
Required Qualifications
7 years in SRE/DevOps; 1 year in AI/ML for reliability (e.g., predictive analytics on incident data).
Proficient in Datadog, Kubernetes, and Python/JavaScript for AI/automation.
Experience reducing MTTR/repeats in distributed systems; familiar with SLOs/error budgets.
BS in CS or equivalent experience; Americas on-call availability.
Preferred Skills
Datadog/similar experience
Fintech experience with MX-like architectures
Google SRE practices: toil elimination, incident management, automation for self-healing.
Proficiency in JavaScript for frontend observability/tools.
Golang and Ruby on Rails experience
At MX, we are a high-performance organization that thrives on trust and results. This role is based in Lehi, Utah, with flexibility for both in-office and remote work. We believe in empowering our team members to deliver exceptional outcomes while taking advantage of our incredible office space when it best supports their work. Our Utah office features onsite perks such as company-paid meals, massage therapists, a sports simulator, gym, mother's lounge, and meditation room, and meaningful interactions with amazing people. We encourage team members to come together in the office to collaborate, kick off key projects, or strategize cross-functionally, fostering connection and innovation.
MX is proudly committed to recruiting and retaining a diverse and inclusive workforce. As an Equal Opportunity Employer, we never discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, military or veteran status, status as an individual with a disability, or other applicable legally protected characteristics. We particularly welcome applications from veterans and military spouses. All your information will be kept confidential according to EEO guidelines. You may request reasonable accommodations by sending an email to
Required Experience:
Senior IC
Full-Time