Senior Site Reliability Engineer – Automation & Observability

Tech Talent International

Job Location:

Montreal - Canada

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Tech Talent International (SI) supplies technical talent to a variety of clients ranging from Fortune 100/500/1000 companies to small and mid-sized organizations in Canada/US and Europe.

We currently have a role as a Senior Site Reliability Engineer (SRE) Automation & Observability with our large consulting client, working onsite at a major financial services client in the downtown Montreal area.

Role: Cybersecurity - Senior Site Reliability Engineer (SRE) Automation & Observability

Type: Permanent or Contract, 40 hrs/week

Location: Hybrid - Downtown Montreal, QC (role starts with 5 days in office for the first 3 months, then becomes a hybrid setup: 3 days onsite, 2 days from home)

Salary: $110000 - $% bonus, 3-5 weeks paid vacation, RRSP contribution, benefits, sick/personal days

Position Overview

The Automation team consists of several Subject Matter Experts (SMEs) who assist the Global Process Owner in designing, building, and maintaining the organization's IT services. While leading the company's IT services team, the IT Service Manager strives to develop reliable IT services and improve the organization's existing IT service infrastructure.

IT Service Managers are responsible for maintaining a high standard of service delivery while managing the organization's IT services and anticipating and resolving issues that may arise within company systems or client environments. These services include infrastructure monitoring, task automation, server asset management, and network inventory management.

Change, incident, problem, and request management, along with CMDB (Configuration Management Database) functions, are core services widely used throughout CIB IT. The ITSM team serves as the bridge between IT and business stakeholders, ensuring coordination and predictability for CIB IT and its business operations.

The team includes SMEs focused on key service areas, as directed by management, with the objective of delivering high-quality services through various platforms that maximize efficiency and produce consistent results.

Within the Automation & Observability organization, the Production Smart Automation team provides production support services for the Analytics Consulting and Digital Assets IT clusters. This includes both functional and technical support, as well as project delivery for production and non-production platforms. The team operates globally and consists of approximately 10 members located in Paris, Warsaw, Mumbai, and Montreal.

Key Responsibilities

The Site Reliability Engineer (SRE) will be part of a multidisciplinary team providing Level 1 and Level 2 technical and project support. This is a production-focused role requiring a broad range of technical expertise.

The SRE will work closely with development and infrastructure teams to:

Monitor, manage, and proactively improve the availability and performance of production environments, from presentation and application layers through infrastructure layers.
Plan and implement application deployments, load testing activities, and configuration changes.
Ensure production environments are operational and available while collaborating with teams to understand user needs.
Contribute to medium- and large-scale technical projects, including architecture reviews, solution design, application upgrades, and migrations to new platforms.
Collaborate on prioritized tasks while providing regular status updates and maintaining focus on target solutions.
Understand delivery lifecycle phases to ensure work is completed according to defined specifications and timelines.
Identify opportunities to improve operational efficiency and contribute to automation initiatives.
Provide constructive feedback and recommendations to management regarding performance, capacity, and system design.
Assist in documenting architectures and designs as well as distributing meeting minutes and action items.

The SRE will also work with other teams to respond to incidents and resolve issues quickly, often under pressure, in order to restore normal business services. As a result, participation in on-call rotations and after-hours support may be required.

Candidates should possess both the aptitude and desire to learn new technologies and contribute innovative ideas that may benefit the department.

Requirements

Candidates should have:

5-7 years of experience in a similar role.
Experience providing multidisciplinary technical support within a team environment.
Practical knowledge of performance and capacity management across:
- Applications
- Databases
- Networks
Strong automation skills and mindset.

Skills & Competencies

Systems Administration

Strong Linux/Unix administration skills
Good knowledge of Windows environments

Containerization & Cloud

Strong knowledge of Docker and Kubernetes
Understanding of cloud-based platforms and solutions

Infrastructure & Networking

Good understanding of enterprise infrastructure, firewalls, and networking concepts
Knowledge of load-balancing technologies
Strong understanding of networking fundamentals

Security

Experience with APIs
Familiarity with CyberArk or HashiCorp Vault

Databases

Experience with SQL Server
Experience with Oracle
Exposure to NoSQL databases

Monitoring & Observability

Experience configuring application monitoring tools such as Dynatrace

DevOps & CI/CD

Experience with:

Jenkins
Bitbucket
Artifactory
Ansible
ArgoCD

Development & Automation

Knowledge of software development and scripting methodologies
Demonstrated programming ability in languages such as Python

IT Service Management

Good understanding of ITIL processes
Understanding of user and server authentication mechanisms that enable automated deployment cycles while maintaining strong security controls

Personal Attributes

Strong problem-solving abilities
Team-oriented mindset
Customer-focused approach
