Site Reliability Engineer

TMS


Job Location:

Dallas, TX - USA

Monthly Salary: Not Disclosed
Posted on: 5 hours ago
Vacancies: 1 Vacancy

Job Summary

Role: Site Reliability Engineer
Duration: 6-Month Contract
Location: Dallas, TX - Hybrid

Job Description (SRE)

Collaborating closely with engineering teams on building and enhancing tooling and automation solutions for faster resolution of issues impacting SLOs and averting incidents altogether when possible.

Collaborating with customers to understand their pain points around Supportability and SLO attainment, and formulating strategies for addressing recurring issues in a sustainable way.

Communicating on a deeply technical level and serving as the single point of contact for large enterprise customers, handling service escalations and driving issues to resolution.

Designing and implementing changes to service telemetry so that automation can consume data that is not already available.

Enhancing the customer-facing experience through proactive alerting based on utilization trends, resource health, etc.

Analyzing data and providing operational insights into customer experience to Design and Product teams so that features can be designed with Supportability in mind.

Experience & Skills

8 years of overall experience, with at least 4-6 years in Site Reliability Engineering on the Microsoft platform

Run the production environment by monitoring availability and taking a holistic view of system health

Build software and systems to manage platform infrastructure and applications

Improve reliability, quality, and time-to-market of our suite of software solutions

Measure and optimize system performance with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement

Provide primary operational support and engineering for multiple large-scale distributed software applications

Ability to program (structured and OOP) using one or more high-level languages such as Python, C#, and JavaScript

Good knowledge of the Azure cloud environment and Azure services

Proactive approach to identifying problems, performance bottlenecks, and areas for improvement

Preferred skills and qualifications

Previous success in technical engineering

Coding experience beyond simple scripts
