Support Engineer, AWS Incident Response
Seattle, WA - USA
Job Summary
The Role
As a Support Engineer on AIR's Seattle team, you'll be on the front line of AWS incident response. You'll lead high-severity calls, triage complex failures across distributed systems, coordinate resolver teams, and drive incidents to mitigation while millions of customers depend on the outcome. Between incidents, you'll obsess over metrics and detection analysis, building dashboards and mechanisms that surface problems before customers notice. You will drive operational improvements that make the incident management ecosystem faster and more accurate.
This isn't a role where you watch dashboards and robotically follow runbooks. You'll deep-dive the largest, most complex technical environment in the world. You'll develop expertise across AWS services, networking, and infrastructure. You'll own operational processes end-to-end and use data to find the next leap in how we protect the cloud. If interested, you'll also have the opportunity to grow your development skills by taking on coding projects matched to your ability level.
This role includes participation in an on-call rotation, including some weekends and holidays.
Key job responsibilities
Incident Response
Lead high-severity incident response calls. Triage, coordinate resolvers across AWS service teams, communicate clearly under pressure, and drive incidents to mitigation. Manage escalations and ensure accurate documentation throughout.
Operational Excellence and Detection
Own and run operational health reviews. Build and maintain dashboards, metrics, and monitoring that surface trends before they become incidents. Obsess over detection accuracy and speed. Detect patterns across events and drive proactive mechanisms to prevent recurrence.
Metrics and Analysis
Deep-dive operational data to identify systemic issues, measure response effectiveness, and prioritize improvements. Use metrics to tell the story of what's working, what's degrading, and where the next risk is hiding.
Process and Tooling Improvement
Identify gaps in operational processes, documentation, and tooling. Build or improve mechanisms that reduce time-to-detection and time-to-mitigation. Use data to prioritize where effort has the highest impact.
Automation and Generative AI
Leverage scripting, generative AI, and automation to accelerate incident response, improve detection, and reduce toil. Identify opportunities where AI can augment human judgment during incidents or surface insights from operational data at scale.
Driving Continuous Improvement
Ensure each incident makes AWS stronger. Work with service teams to ensure learnings from incidents drive corrective actions and that follow-through happens. Close the loop between what broke and what gets fixed.
- 2 years of technical support experience
- Direct experience participating in incident response for production systems
- Strong understanding of operating systems (Linux), networking fundamentals, and distributed systems
- Experience with operational monitoring, alerting, and metrics (CloudWatch, Datadog, Grafana, or equivalent)
- Demonstrated ability to troubleshoot complex technical problems spanning multiple systems or services
- Experience scripting or programming in at least one modern language (Python, Bash, Go, or similar)
- Ability to clearly break down technical complexity for a wide range of audiences, from engineers to senior leadership, without relying on jargon
- Familiarity with incident management tooling and workflows
- Experience with AWS services and cloud infrastructure
- Experience using generative AI or automation to solve operational problems or accelerate workflows
- Track record of authoring post-incident analyses (post-mortems) and driving corrective actions to completion
- Experience building operational dashboards, runbooks, or automation that improved team efficiency
- Experience coordinating across globally distributed teams and time zones
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at
Seattle, WA: 90,400.00 - 158,200.00 USD annually
Required Experience:
IC
About Company
Free shipping on millions of items. Get the best of Shopping and Entertainment with Prime. Enjoy low prices and great deals on the largest selection of everyday essentials and other products, including fashion, home, beauty, electronics, Alexa Devices, sporting goods, toys, automotive ...