Site Reliability Engineer II
Austin, TX - USA
Job Summary
Restaurant365 is a SaaS company disrupting the restaurant industry! Our cloud-based platform provides a unique, centralized solution for accounting and back-office operations for restaurants. Restaurant365's culture is focused on empowering team members to produce top-notch results while elevating their skills. We're constantly evolving and improving to make sure we are, and always will be, Best in Class ... and we want that for you too!
This role requires a hybrid work schedule based out of one of our office locations: Austin, TX; Irvine, CA; or Akron, OH.
The Site Reliability Engineer II will be responsible for supporting, enhancing, and maintaining Restaurant365's cloud infrastructure and applications. Qualified candidates will demonstrate growing expertise in site reliability practices, with skills in incident response, system monitoring, automation, and performance troubleshooting. You will collaborate with DevOps, development, and infrastructure teams to resolve moderately complex issues, propose improvements, and strengthen the reliability, scalability, and security of our SaaS platform.
How you'll add value:
- Execution & Collaboration
- Respond to production incidents, perform triage and troubleshooting, and contribute to post-incident analysis.
- Identify and automate manual processes to improve efficiency and reduce risk.
- Enhance and evolve monitoring tools and platforms to improve observability.
- Promote and apply best practices for reliability, scalability, and performance across engineering.
- Implement and support cloud automation using Terraform, Ansible, or CloudFormation.
- Work within change management protocols to provide maximum uptime for production systems.
- Participate in the on-call rotation, providing 24x7 support for incidents and contributing to root cause analysis.
- Partner with developers, architects, vendors, and IT teams to ensure reliable system operations.
- Research and remediate vulnerabilities in coordination with security teams.
- Maintain documentation of infrastructure, monitoring, runbooks, and incident response procedures.
- Standards & Process
- Apply company policies and procedures when handling operational tasks and incidents.
- Suggest and implement improvements to operational processes and monitoring practices.
- Contribute to technical diagrams, documentation, and runbooks for system reliability.
- Learning & Growth
- Expand expertise in cloud services (Azure, AWS, or GCP) and container platforms (EKS, ECS, AKS).
- Build proficiency with observability and monitoring tools (Prometheus, Grafana, ELK, Site24x7, Nagios).
- Develop scripting and automation skills using Python, Bash, PowerShell, or similar.
- Participate in planning discussions by contributing technical input on system stability and reliability.
What you'll need to be successful in this role:
- BS in Computer Science, Information Systems, or related field (or equivalent experience).
- 2-4 years of experience in site reliability engineering, DevOps, or cloud operations.
- Experience with cloud platforms (Azure or AWS), including services such as AKS, ECS/EKS, Functions/Lambda, S3, and Blob storage.
- Proficiency with infrastructure-as-code and automation (Terraform, Ansible, YAML, Python, Bash, PowerShell).
- Strong Linux engineering skills; working knowledge of Windows administration.
- Experience supporting production environments and participating in on-call rotations.
- Familiarity with web servers and middleware (Nginx, Apache, Tomcat).
- Experience with CI/CD tools (GitLab, Git, or similar).
- Strong written, oral, and interpersonal communication skills.
Preferred Qualifications
- Experience with monitoring tools (Prometheus, Grafana, ELK, Site24x7, Nagios).
- Knowledge of performance analysis and system vulnerability remediation.
- Cloud certification (AWS or Azure) preferred.
- Familiarity with restaurant industry SaaS platforms and customer-facing applications.
R365 Team Member Benefits & Compensation
- This position has a salary range of $98,583-$138,016 annually. The above range represents the expected salary range for this position. The actual salary may vary based upon several factors, including but not limited to relevant skills/experience, time in the role, business line, and geographic location. Restaurant365 focuses on equitable pay for our team and aims for transparency with our pay practices.
- Comprehensive medical benefits, 100% paid for employee
- 401k matching
- Equity Option Grant
- Unlimited PTO, Company holidays
- Wellness initiatives
DYN365 Inc d/b/a Restaurant365 is an equal opportunity employer.
About Company
R365's cloud-based restaurant management software helps leaders master accounting, operations, & workforce to create incredible moments that drive profits.