Site Reliability Engineer

Employer Active

1 Vacancy
The job posting is outdated and position may be filled
Job Location: Sandy - USA

Monthly Salary: Not Disclosed
Job Description

Please reference the schedule and minimum qualifications listed below before applying.

If you need assistance with filling out our application form or during any phase of the application, interview, or employment process, please notify our Human Resources Team at option 1 or email, and every reasonable effort will be made to accommodate your needs in a timely manner.

Job Summary

The Site Reliability Engineer (SRE) is responsible for keeping all member-facing and internal production systems running smoothly. As an SRE, you will work with multiple teams to encourage SRE principles, maintain the availability and reliability of systems, establish SLIs/SLOs, and develop tools and monitoring for operational visibility. SREs are members of the scrum teams and work closely with quality and software engineers to support services prior to general availability through activities such as launch reviews, performance reviews, and validating logging in dev environments. SREs are also responsible for ensuring quality releases to production environments. The SRE participates in an on-call rotation, working with internal and vendor teams to manage, troubleshoot, and resolve production issues.

Job Description

Location:

9800 S. Monroe Street

Sandy, UT 84070

Schedule:

Full-time, Hybrid

To be effective, an individual must be able to perform each job duty successfully.

  • Keep current with emerging testing techniques and technologies as well as emerging development practices.
  • Assist in diagnosing, finding the root cause of, reporting, and tracking production and non-production issues.
  • Continually research new ways of improving and scaling systems and services.
  • Lead initiatives to improve the reliability, scalability, and availability of production applications.
  • Build out tools, platforms, and processes to enable these goals.
  • Lead and contribute to the design, development, and improvement of SRE practices and procedures.
  • Create and maintain health dashboards, identifying and measuring health indicators (SLIs/SLOs) and providing tools for operational visibility of production systems.
  • Participate in and contribute to improving our incident response, acting as an escalation point for production incidents.
  • Perform root cause analysis (RCA), troubleshoot, and debug issues across our applications and services to identify and fix root causes.
  • Enhance and maintain the software release procedures and processes.
  • A strong desire and aptitude for system automation to eliminate manual work in day-to-day operations.
  • Skilled with application monitoring practices and tools (New Relic, Azure Monitor, DataDog, Splunk, etc.).
  • Understanding of and experience with SRE and DevOps principles. Demonstrated experience working in Agile teams leveraging Scrum, Kanban, or other methodologies, and/or understanding of Agile development concepts.
  • Meets the needs of the end user in a quality, consistent, and professional manner, using independent judgment where appropriate.
  • Mentors less experienced engineers.
  • Excellent communication skills (verbal and written) are critical, along with exceptional problem-solving skills and exceptionally professional behavior when interacting with and responding to other technical teams throughout the organization.
  • Take part in an on-call rotation.
  • Performs additional duties and responsibilities as assigned.

KNOWLEDGE, SKILLS & ABILITIES

The requirements listed are representative of the knowledge, skills, and/or abilities required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential job functions.

EXPERIENCE

  • Minimum 4 years of professional experience in site reliability engineering, software development, or systems administration
  • Experience monitoring or troubleshooting web applications
  • Experience with Scrum and associated tools such as Azure DevOps or Jira
  • Experience with some of the following tool sets:
    • Application monitoring tools (New Relic, DataDog, Splunk, etc.)
    • Automation tools (Pega, Microsoft Power Platform, Logic Apps, etc.)
    • API tools (Rest#, Postman, Swagger, etc.)
    • Front end tools (Selenium, Page Object Model, etc.)
    • Backend tools (SQL Server, Entity Framework, Dapper, etc.)
    • Build tools (Node, Docker, Azure Pipelines, etc.)
    • Infrastructure as Code (Terraform, Ansible, Chef, etc.)
  • Experience with automating monitoring and/or alerting on some of the following:
    • Web applications in Angular and React
    • Internal support tools
    • 3rd party integrations
    • Database and API connections (REST and SOAP)
    • Cloud solutions (AWS, Azure, or others)
  • Experience working in an agile, CI/CD, or rapid software testing environment.
  • Experience with and understanding of Git and source control concepts.

EDUCATION

Education must be from an accredited institution. Education will be verified.

  • Bachelor's degree in computer science, computer information systems, management information systems, or a related technical field, or equivalent experience.

MANAGERIAL RESPONSIBILITY

Has no supervisory/managerial responsibilities. May provide coaching and/or mentoring to others on the team.

OTHER SKILLS & ABILITIES

  • Demonstrated proficiency with Microsoft Office Suite, including Outlook, Word, PowerPoint, and Excel.
  • Ability to work both autonomously and collaboratively in a fast-paced environment.
  • Self-starter with strong organizing and time management skills.
  • Adaptive to change; responds positively to altered circumstances or conditions.
  • Possess a desire and willingness to learn and continually update knowledge base on financial concepts, strategies, systems, etc.
  • Take initiative to be a problem solver and provide suggestions to improve processes and efficiencies.
  • Excellent interpersonal skills including the ability to collaborate with other teams as needed.
  • Data analytics and data validation skills.
  • Demonstrated ability to clearly express ideas, methodology, results, and recommendations verbally, in writing, and through insightful reports and graphic illustrations.

PHYSICAL ABILITIES / WORKING CONDITIONS

  • Ability to sit, talk, and hear consistently
  • Ability to stand, walk, and use hands to handle or reach occasionally
  • Close vision (clear vision at 20 inches or less)
  • Distance vision (clear vision at 20 feet or more)
  • Ability to lift up to 25 pounds occasionally; may need to lift up to 50 pounds.

ENVIRONMENTAL

There are no unusual environmental factors. Work is conducted in a typical office setting with moderate noise (e.g., business office with computers and printers, light traffic).

Mountain America Credit Union is an EEO/AA/ADA/Veterans employer.

Employment Type

Full-Time
