Position : System Reliability Engineer Production Support
Location : Any Location Frisco/TX, Bellevue/WA, Atlanta/GA, Overland Park/KS
Onsite from Day 1, 5 Days/Week
No. Of Openings : 02
Total Years of Experience required: 10 Years
Job Summary
An SRE who leads incidents, manages the bridge until the issue is resolved, and tracks action items to closure, including change management and problem management.
Responsibilities
- Actively driving incident calls, working with Technical Product SMEs and Tier 2 SREs
- Establishing a timeline of the incident's progression and following up on action items until closure, during or after the call
- Summarizing the discussion into knowledge articles and action items, and doing warm handoffs to Tier 2 teams
- Being adopters and advocates of best practices collated from experience and SMEs, such as OTel, app availability, and resiliency
- Change management and problem management
- Posting updates on AHOD and providing regular updates to leadership
Years of experience needed
- 6 years of strong technical project management background
- Has worked in the telecom industry and in SRE Ops
- Has experience working in the digital space and with digital products
- Able to quickly leverage tools such as Splunk, OTel, Grafana, Power BI, and AppD
- Previous T-Mobile experience is a must
Skills:
- Ability to focus on incidents and work with SMEs and Tier 2 Leads
- Attention to detail, catching even the smallest detail mentioned on a call
- Diligence in follow-up, driving the SRE mandate to every team and partnering with them to operationalize best practices
- Great at detailed communication with ICs and precise, succinct communication with leadership
- Drive to chase down even outlier issues to resolution
Please share valid resume to