Title: Systems Reliability Engineer (SRE)
Location: Montreal, QC (Onsite)
Type: Contract
Systems Reliability Engineering (SRE) is a discipline focused on improving system service availability, observability, scalability, performance, and resilience across Client by applying sound software engineering principles and adopting the latest technology and tooling. We are growing SRE capabilities within our Reliability & Production Engineering (RPE) organization as part of the transformation of Client Technology.
Your responsibilities will include, but not be limited to:
- Working closely with engineering/development teams to design, build, and maintain systems.
- Troubleshooting issues across the entire technology stack: hardware, software, application, and network.
- Identifying and driving opportunities to improve automation for our platforms; scoping and creating automation for deployment, management, and visibility of our services.
- Proactively identifying and addressing systems reliability risks.
- Working alongside existing global and regional team members on a follow-the-sun basis.
- Representing the RPE organization in design reviews and operational readiness exercises for new and existing services.
- Providing production support services under the RPE organization.
- Developing automation and tooling to support SRE activities and achieve specific reliability and supportability goals (reduction of toil, monitoring and alerting efficiency, etc.) for in-scope systems and across the larger organization.
The key skill sets required for day-to-day work are:
- Bachelor's degree in computer science or a related field.
- Proficiency with Linux.
- Strong experience in database scripting (stored procedures and compound SQL) and data analysis in Sybase, DB2, Greenplum, etc.
- DB monitoring and performance tuning.
- Working experience in Python/Perl/Shell scripting.
- Troubleshooting skills (tracking trends, producing metrics, and analysis).
- Strong verbal and written communication skills, required to interact with global teams and customers.
- Flexibility to work in shifts and take on on-call responsibilities.
- Working from the office (3 days per week minimum is the current policy).
Good to have:
- Experience in financial services/products or investment banking.
- Experience with advanced monitoring/alerting tools (Splunk, AppDynamics, Elasticsearch, etc.).
- Knowledge of development tools such as Git, Jenkins, etc.
- Agile/DevOps/SRE mindset and/or tooling.
- Understanding of cloud technology.