Principal Site Reliability Engineer (SRE)

CGI

Job Location:

Toronto - Canada

Monthly Salary: $90,000 - $140,000
Posted on: 11 hours ago
Vacancies: 1 Vacancy

Job Summary

Position Description:

Location: Edmonton
Open to other locations within proximity to a CGI Office
Hybrid work model

We are hiring a Senior Site Reliability Engineer (SRE) with a strong foundation in building and operating reliable, scalable, and resilient cloud platforms. You bring a reliability and performance engineering mindset to everything you do, balancing operational stability with modernization. In this role, you will apply core SRE practices, including SLIs/SLOs, observability, incident management, and operational automation, while temporarily supporting a regional support strategy engagement focused on assessing and strengthening large-scale operational environments. You will work closely with platform, operations, and architecture teams to evaluate current-state practices, identify reliability and support gaps, and contribute to the definition of future-state operating models and implementation roadmaps.
Beyond this engagement, the role is designed for ongoing, hands-on SRE delivery, where you will lead and implement monitoring, reliability engineering, automation, and tooling across cloud and hybrid environments. You will collaborate with cross-functional teams to design, build, and continuously improve platform reliability, engineering standards, and operational excellence practices for mission-critical services.
This position places you in a client-facing, high-impact environment where your technical depth, operational judgment, and ability to translate reliability principles into practical outcomes will directly influence service stability, modernization efforts, and future cloud initiatives. If you are a proven SRE who thrives in complex environments and values both hands-on engineering and operational leadership, this role offers the opportunity to make a meaningful and lasting impact.

Your future duties and responsibilities:

Who are You
You are a senior Site Reliability Engineer who thrives on solving complex reliability and operational challenges at scale. You are curious, collaborative, and continuously focused on improving how platforms, infrastructure, and services are operated and supported. Your strength lies in applying sound engineering judgment to real-world operational problems, balancing reliability, performance, and maintainability. You are equally comfortable working hands-on with tools and systems and stepping back to assess how operational practices, support models, and workflows impact service reliability. You can engage confidently in technical discussions with engineers while also communicating clearly with operational leaders and stakeholders to explain risks, trade-offs, and improvement opportunities.
With a mindset grounded in continuous improvement and learning, you champion modernization, automation, and pragmatic reliability practices. You are trusted for your ability to identify root causes rather than symptoms, to raise concerns early, and to translate reliability principles into practical, actionable outcomes. Your peers value your technical depth and calm leadership in complex environments, and teams rely on you to elevate operational maturity and execution quality. At CGI, we recognize strong SRE practitioners and provide the environment and support for them to grow, contribute, and make a meaningful impact across engagements.


Responsibilities
  • Develop, operate, and evolve monitoring, logging, and alerting capabilities across cloud and hybrid environments, while temporarily contributing SRE expertise to assess and rationalize existing operational monitoring practices as part of a regional support strategy initiative.
  • Define, implement, and continuously improve SLIs, SLOs, and SLAs for platform and service reliability, applying these principles during the engagement to evaluate current-state service outcomes and inform future-state reliability targets.
  • Lead and participate in incident response, problem investigation, and root cause analysis, leveraging hands-on SRE experience to identify systemic reliability issues and recurring operational failure patterns observed across regional support operations.
  • Design and automate reliability and operational processes, including integration with CI/CD pipelines and operational workflows, while contributing insights into where automation and tooling can reduce manual effort and improve support consistency across regions.
  • Collaborate closely with DevOps, platform engineering, architecture, and application teams, providing SRE leadership during this engagement and transitioning seamlessly to tool- and platform-heavy delivery roles on future projects.
  • Analyze and document current operational workflows, support models, and escalation paths, translating frontline operational insights into actionable reliability and service improvement recommendations.
  • Contribute to the definition of future-state operating models and implementation roadmaps by applying SRE and operational excellence principles to improve reliability, supportability, and scalability.
  • Provide regular status updates and risk assessments, highlighting operational risks, dependencies, and reliability impacts to support informed decision-making.

Required qualifications to be successful in this role:

  • 5 years of experience in Site Reliability Engineering, platform engineering, or infrastructure operations, with demonstrated ability to apply reliability principles across both delivery and operational contexts.
  • Strong proficiency with observability and monitoring platforms such as Grafana, Prometheus, ELK, New Relic, or equivalent, with the ability to assess, design, and improve monitoring strategies in complex environments.
  • Hands-on experience operating cloud platforms (Azure, AWS, and/or GCP), including production support, reliability engineering, and operational troubleshooting.
  • Strong automation and scripting skills using tools such as Python, Bash, Ansible, or equivalent, with a mindset focused on reducing toil and improving operational efficiency.
  • Excellent communication skills in English (French considered an asset), with the ability to clearly articulate technical concepts to both technical and non-technical stakeholders.
  • Proven track record of improving system reliability, availability, and operational stability, including measurable reductions in incident frequency or impact.
  • Experience analyzing and documenting operational workflows, support models, and escalation paths within IT or platform operations environments.
  • Ability to facilitate technical and operational workshops with engineers, operations teams, and service stakeholders to validate findings and align on improvements.
  • Working knowledge of ITSM / ITIL practices (Incident, Problem, Change), particularly as they relate to reliability, supportability, and operational maturity.
  • Experience working in regulated, enterprise, or public-sector environments where documentation quality, security classification, and auditability are required.

CGI is providing a reasonable estimate of the pay range for this role. The determination of this range includes factors such as skill set level, geographic market, experience and training, and licenses and certifications. Compensation decisions depend on the facts and circumstances of each case. A reasonable estimate of the current range is $90,000 to $140,000. This role is a future opportunity.

Use of the term "engineering" in this job posting refers to the technical sense related to Information Technology (IT) and does not imply that the individual practices engineering or possesses the requisite license as prescribed by the applicable provincial or territorial engineering regulator. We are seeking individuals with expertise in IT engineering-related functions, but licensure from an engineering regulator is not a prerequisite for this position. Engineering is a regulated profession in Canada, which is restricted in terms of use of titles and designation.

Skills:

  • Finance&Ops Apps Solution Arch

What you can expect from us:

Together, as owners, let's turn meaningful insights into action.

Life at CGI is rooted in ownership, teamwork, respect, and belonging. Here, you'll reach your full potential because:

You are invited to be an owner from day 1 as we work together to bring our Dream to life. That's why we call ourselves CGI Partners rather than employees. We benefit from our collective success and actively shape our company's strategy and direction.

Your work creates value. You'll develop innovative solutions and build relationships with teammates and clients while accessing global capabilities to scale your ideas, embrace new opportunities, and benefit from expansive industry and technology expertise.

You'll shape your career by joining a company built to grow and last. You'll be supported by leaders who care about your health and well-being and provide you with opportunities to deepen your skills and broaden your horizons.

At CGI, we value the strength that diversity brings and are committed to fostering a workplace where everyone belongs. We collaborate with our clients to build more inclusive communities and empower all CGI partners to thrive. As an equal-opportunity employer, being able to perform your best during the recruitment process is important to us. If you require an accommodation, please inform your recruiter.

To learn more about accessibility at CGI, contact us via email. Please note that this email is strictly for accessibility requests and cannot be used for application status inquiries.

Come join our team, one of the largest IT and business consulting services firms in the world.


Required Experience:

Staff IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

The COMPANY is one of the few end-to-end consulting firms with the scale, reach, capabilities and commitment to meet clients’ enterprise digital transformation needs. Our 77,500 consultants and professionals work side-by-side with clients in 10 industries across more than 400 location ...