Site Reliability Engineer, India - Hybrid

Job Location: Bengaluru - India

Monthly Salary: Not Disclosed
Posted on: 30+ days ago
Vacancies: 1

Job Summary

Detailed JD (Roles and Responsibilities)

The Site Reliability Engineer II is responsible for providing continuous feedback on site health, reliability, availability, and user experience to both engineering and product owners. Real-time measurements for production environments will be collected, aggregated, and analyzed using both infrastructure and APM tools, including but not limited to SolarWinds, Dynatrace, and log analytics. In addition to monitoring and insight, a heavy focus will be placed on automation opportunities and automating operational processes to maintain 99.9% availability of AvidXchange core products.

  • Performs Production SaaS operational and administration duties to maintain the health and reliability of SaaS production systems
  • Performs Production SaaS support, incident management, problem management, and service restoration as needed to quickly respond to and resolve production issues
  • Implements and trains team members on tools for measuring core product health in production (with opportunities to extend those capabilities all the way back through the entire DevOps pipeline)
  • Implements and trains team members on calculating system availability SLAs across AvidXchange products
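As a rough illustration of the availability arithmetic behind the 99.9% target, the sketch below converts a target into a downtime budget and computes measured availability from observed downtime. The 30-day window and the function names are assumptions made for illustration; the posting does not say how AvidXchange defines or measures its availability SLAs.

```python
def allowed_downtime_minutes(target_percent: float, window_days: float) -> float:
    """Minutes of downtime permitted in the window while still meeting the target."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - target_percent / 100)


def measured_availability(total_minutes: float, downtime_minutes: float) -> float:
    """Availability as a percentage of the total time in the window."""
    return 100 * (total_minutes - downtime_minutes) / total_minutes


if __name__ == "__main__":
    # Assumed 30-day window (hypothetical; not stated in the posting).
    budget = allowed_downtime_minutes(99.9, 30)
    print(f"Downtime budget: {budget:.1f} minutes")
    print(f"Availability with 20 minutes down: {measured_availability(30 * 24 * 60, 20):.3f}%")
```

Under these assumptions, a 99.9% target over 30 days leaves a budget of about 43 minutes of downtime.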

JOB OVERVIEW

The Site Reliability Engineer is responsible for providing continuous feedback on site health, reliability, availability, and user experience to both engineering and product owners. Real-time measurements for production environments will be collected, aggregated, and analyzed using both infrastructure and APM tools, including but not limited to SolarWinds, Dynatrace, and log analytics. In addition to monitoring and insight, a heavy focus will be placed on automation opportunities and automating operational processes to maintain 99.9% availability of AvidXchange core products.

  • Implements and executes the tool consolidation strategy to optimize spend versus value for our end-to-end monitoring platform
  • Implements rapid and continuous development and application of automated solutions to address reliability issues and automate manual tasks
  • Works with the Software DevOps team to implement the DevOps CI/CD, continuous performance testing, monitoring, and reliability strategy using Visual Studio Team Services and other cloud-based tools
  • Implements the measurement capability of core product availability across Azure and AvidXchange Cloud using HTTP endpoint testing and synthetic user testing
  • Maintains the automated site availability reporting and data platform
  • Gathers usability, reliability, incident, and user experience data on AvidXchange products for consumption by executive leadership on a weekly basis
  • Influences product delivery teams to implement usability and reliability enhancements leading to improved user experience index scores and improved availability
  • Provides detailed analysis and troubleshooting for system outages, providing feedback to product / software engineering

The candidate is also required to be willing to work in an on-call rotation. Each engineer's turn comes up roughly every 2.5 months, and an on-call shift is 24x7 for 2 weeks.

Total Experience

5 years total experience

Relevant Experience

3 years relevant experience

Mandatory skills

Site Reliability Engineer

APM tools, including but not limited to SolarWinds, Dynatrace, and log analytics

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root Cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting