Manager, DevOps Engineering

Employer Active

1 Vacancy
Job Location

Toronto - Canada

Monthly Salary

Not Disclosed

Job Description

Location of Job: Remote - Canada

You will work with engineering and product leadership to craft the long-term roadmap and quarterly deliverables for the areas your team works on, ensuring that our products continue to stand out. You will be responsible for tactical execution by the team, helping to coordinate activities such as sprint planning, design reviews, and project planning.

We are the result of multiple acquisitions, giving us the unique opportunity to manage and integrate a diverse set of platforms. To improve efficiency and reduce complexity, we're actively consolidating these platforms into a unified infrastructure. This means we're building a scalable, flexible foundation that can support the needs of multiple products.

We're seeking a Manager, DevOps Engineering to lead a team dedicated to building and maintaining the core infrastructure that powers our entire platform. In this role, you'll drive initiatives that streamline development workflows, eliminate friction, and enable our engineering teams to move faster with confidence. If you're passionate about creating scalable, developer-friendly platforms that boost productivity and remove bottlenecks through smart automation and thoughtful design, this could be the perfect opportunity for you.

What You Will Do 

  • Lead Developer Experience and Cloud Infrastructure: Manage a team focused on improving the end-to-end experience for developers, from local development to production, while ensuring a stable and scalable cloud-native platform.

  • Build Developer-Centric Platforms: Design and evolve internal platforms, CI/CD pipelines, and tooling that empower product engineers to ship features faster, safely, and with confidence.

  • Foster Developer-Centric Product Thinking: Gather developer feedback, measure platform adoption metrics, and iterate based on user needs to ensure your platforms truly serve internal customers.

  • Drive AI-First Infrastructure Innovation: Lead AI-first approaches to infrastructure automation, predictive scaling, intelligent alerting, and self-healing systems to reduce toil and improve reliability.

  • Champion Infrastructure Automation: Drive complete automation of infrastructure provisioning, configuration, deployment, and monitoring to enable repeatability, self-service, and reliability.

  • Implement SRE Best Practices: Introduce and uphold Site Reliability Engineering principles, including error budgets, incident response, and toil reduction, to improve system resilience and uptime.

  • Lead Platform Consolidation: Spearhead platform consolidation efforts resulting from multiple acquisitions, managing technical debt and legacy system migrations while maintaining operational excellence.

  • Enhance Observability and Reliability: Increase system observability through comprehensive metrics, tracing, and alerting to enable rapid detection and resolution of production issues.

  • Collaborate Across Engineering: Partner closely with application developers, QA, security, and product teams to ensure infrastructure and developer platforms meet evolving business needs.

  • Improve SLAs and Operational Excellence: Continuously improve service level indicators (SLIs), objectives (SLOs), and agreements (SLAs) across systems to meet or exceed uptime goals.

  • Manage a Distributed, High-Performing Team: Lead a globally distributed team of infrastructure, DevEx, and SRE engineers; recruit, mentor, and grow the team to meet technical and leadership goals.

  • Foster a Culture of DevOps and Ownership: Promote ownership and self-service among development teams by building tools and platforms that support full lifecycle responsibility.

Technologies We Use 

  • Cloud & Infrastructure: AWS, Docker, Kubernetes (EKS), Terraform, Vault

  • Observability & Reliability: Prometheus, Grafana, New Relic, PagerDuty

  • CI/CD & DevEx: GitHub, GitHub Actions, GitHub Packages, Atlantis, ArgoCD

  • Languages & Data: Java, Python, PostgreSQL, MongoDB, RabbitMQ, Elasticsearch, Event Store


Qualifications

  • 10 years of professional experience in infrastructure, SRE, or platform engineering, with 2 years in a people management or technical leadership role

  • Bachelor's degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)

  • Proven agile leadership experience, including mentoring and coaching high-performing teams, with demonstrated ability to recruit and develop top engineering talent

  • Strong ability to lead sprint planning, manage execution roadmaps, and drive cross-functional collaboration, with excellent communication skills across technical and non-technical audiences

  • Experience treating internal platforms as products, gathering developer feedback, and applying product management principles to infrastructure services

  • Experience integrating disparate systems and platforms, ideally in post-acquisition environments or complex legacy migration scenarios

  • Proven track record of building internal platforms, CI/CD systems, and developer tooling that improve engineering productivity

  • Deep understanding of cloud-native systems, container orchestration (e.g., Kubernetes), and Infrastructure as Code (IaC)

  • Hands-on experience migrating legacy infrastructure to modern cloud environments (e.g., AWS, Kubernetes)

  • Strong operational expertise in high-availability systems, distributed architectures, and incident response

  • Security-first mindset with working knowledge of SOC 2, PCI DSS, and other compliance standards

  • Experience or strong interest in leveraging AI/ML for infrastructure optimization, automated troubleshooting, predictive operations, or self-healing systems

  • Proficient in Linux system administration and comfortable working at the command line

  • Deep familiarity with SRE practices, including SLIs, SLOs, error budgets, and reliability reviews

  • Experience with DevOps and automation tools such as Vault, Terraform, Atlantis, GitHub Actions, and Kubernetes

  • Strong scripting skills, especially in Bash (Python or Go experience is a plus)

  • Ability to lead sprint planning, manage execution roadmaps, and drive cross-functional collaboration

  • Passion for developer experience, with attention to usability, performance, and documentation

  • Excellent communication skills, both verbal and written, across technical and non-technical audiences

  • Willingness to participate in after-hours support for critical incidents as needed

  • Strong background in systems operations, reliability engineering, and cloud infrastructure best practices

  • Track record of hiring and developing top engineering talent and building high-impact teams

 

Bonus Points 

  • Experience managing large-scale deployments of PostgreSQL, MongoDB, RabbitMQ, or Elasticsearch

  • Background in operating hosted or hybrid data center environments

  • Familiarity with Windows Server infrastructure, especially in hybrid cloud setups

  • Exposure to time-series databases or event-sourced system architectures

  • Advanced experience with AI/ML-powered observability, automation, or self-healing infrastructure

  • Active participation in incident retrospectives, with a track record of driving systemic improvements


Additional Information

Other Duties - Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties, or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.

EEO STATEMENT - Wiser Solutions Inc. is an Equal Opportunity Employer and prohibits discrimination, harassment, and retaliation of any kind. Wiser Solutions Inc. is committed to the principle of equal employment opportunity for all employees and applicants, and to providing a work environment free of discrimination, harassment, and retaliation. All employment decisions at Wiser Solutions Inc. are based on business needs, job requirements, and individual qualifications, without regard to race, color, religion, sex, national origin, family or parental status, disability, genetics, age, sexual orientation, veteran status, or any other status protected by state, federal, or local law. Wiser Solutions Inc. will not tolerate discrimination, harassment, or retaliation based on any of these characteristics.

Base pay is one part of our total compensation package. Pay is established on an individual basis after considering multiple factors such as relevant experience, education, and other qualifications. In addition, we take into account geographical differentials and make sure pay is equitable with our current staff. For this position, our hiring range for base annual pay is estimated to be CAD $175,000 to $200,000 at the time of this posting.

Performance-based discretionary bonuses and variable pay plans are available for some positions. 

If you require accommodation to complete any part of the application process or need an alternative manner to apply please contact us at or call .  


Remote Work

No


Employment Type

Full-time

Department / Functional Area

Engineering
