Wood Mackenzie is the global data and analytics business for the renewables, energy and natural resources industries. Enhanced by technology. Enriched by human intelligence. In an ever-changing world, companies and governments need reliable and actionable insight to lead the transition to a sustainable future. That's why we cover the entire supply chain with unparalleled breadth and depth, backed by over 50 years' experience. Our team of over 2,400 experts, operating across 30 global locations, are enabling customers' decisions through real-time analytics, consultancy, events and thought leadership. Together, we deliver the insight they need to separate risk from opportunity and make confident decisions when it matters most.
Wood Mackenzie Values
Site Reliability Engineer II
Job Description
Wood Mackenzie has an exciting opportunity for a Site Reliability Engineer (SRE) II to join a dynamic global business to help drive change and innovation. We are looking for an SRE professional to help us manage and support our products and services within the enterprise.
Role Purpose
The principal responsibility of this role is to provide operational expertise within the SRE team and work with the software engineering teams for releasing and maintaining new and existing applications. This encompasses:
Working in partnership with the business and the technology teams, bringing awareness and insight of the different operational constraints and opportunities for projects targeting cloud-based or on-premises deployment.
Advanced implementation and maintenance of more complex cloud and on-prem resources and environments.
Promotion of mutual feedback in cross-functional groups following SRE best practices within a DevOps culture.
Implementation of advanced continuous integration/delivery toolsets or the processes, resources and platforms that use those tools.
Strong focus on service availability and proactive detection of problems.
Ability to articulate technical and business concepts to different audiences and to influence technical decisions with solid metrics collection and proofs of concept.
Responsibilities:
Advanced Pipeline Implementation and Optimization: Work closely with cross-functional teams, including developers, QA and product managers, to develop, implement and maintain advanced delivery pipelines for efficiency and scalability using tools like Jenkins, TeamCity, Octopus Deploy and GitHub Actions.
Operational Insights: Provide advanced solutions for operational excellence by identifying operational constraints and opportunities such as auto-scaling, container orchestration and system resiliency.
Proactive Monitoring: Develop proactive monitoring solutions to predict and mitigate potential issues. Regularly analyze monitoring data to identify trends and areas for improvement.
Tooling Innovation: Drive innovation in team tooling and processes. Continuously evaluate the fit and purpose of industry tools used across the team.
Operational Leadership: Mentor Level 1 engineers and lead operational improvements.
Continuous Improvement: Embrace a mindset of continuous improvement, regularly reviewing operational processes to identify inefficiencies. Assist with drafting and proposing actionable plans for process enhancements to increase team efficiency and system reliability.
Incident Management: Actively lead incident response efforts (P3 and P4) and conduct post-incident reviews. Be actively involved in troubleshooting and collaboration on active incidents.
Documentation and Knowledge Sharing: Ensure thorough documentation, adding or changing it where necessary, and foster a culture of knowledge sharing.
On-Call Rotation: Participate in a 24/7 on-call support rotation, providing advanced support and responding to system alerts and incidents to ensure continuous system availability and performance.
Qualifications
We understand every organization is different and professionals have their own unique history and experience, so we don't expect to find a 100% match of candidate competencies with the tech stack we use at Wood Mackenzie. We list our preferred technologies, but if you have transferable knowledge and are willing to learn what you do not know, we will consider your application.
Skill Requirements:
Experience: Minimum of 2-4 years in SRE/DevOps roles.
Leadership in Agile: Experience leading agile processes supporting multiple software engineering teams for product releases and providing monitoring and insights into issues.
Advanced Cloud Skills: Strong Amazon Web Services (AWS) understanding (Cloud Practitioner or equivalent knowledge); working knowledge of Azure (AZ-900 or equivalent knowledge).
DevOps Mindset: Strong understanding of DevOps principles and operational model, including continuous integration, continuous delivery and infrastructure as code. Experience with agile methodologies such as Kanban or Scrum and familiarity with JIRA for issue tracking.
Team Collaboration: Demonstrated ability to work independently as well as part of a cross-functional multi-locational team. Effective communication skills to collaborate with team members and stakeholders.
Resource Maintenance: Develop and oversee routine system maintenance tasks, including patch management, system backups and performance tuning.
Advanced Linux Skills: Proficient in Linux administration (RHEL/Ubuntu) and automation.
Automation Expertise: Expertise with configuration management tools (e.g. Ansible, SaltStack) for automating system configurations and deployments.
Mentorship: Ability to mentor SRE I engineers.
Additional Preferred Skills:
Advanced Cloud Proficiency: Extensive AWS and Azure experience.
Containerization Expertise: Proficiency in Docker and Kubernetes.
Advanced Scripting and Automation: Strong scripting skills in Python, Bash or PowerShell.
Infrastructure as Code: Experience with Terraform, CloudFormation or Pulumi.
CI/CD Pipelines: Deep understanding of CI/CD principles.
Monitoring and Logging: Proficiency with tools like Prometheus, Nagios, Grafana, ELK Stack, Splunk, App Insights and CloudWatch.
Networking Knowledge: Basic understanding of networking concepts.
Security Best Practices: Familiarity with security best practices and compliance frameworks (e.g. SOX, SOC 2, NIST).
Database Management: Experience with SQL and NoSQL databases.
Enterprise SaaS Applications: Experience working with SaaS applications such as Okta, Jira and Confluence.
Collaboration Tools: Experience with Git, GitHub and documentation platforms like Confluence.
Equal Opportunities
We are an equal opportunities employer. This means we are committed to recruiting the best people regardless of their race, colour, religion, age, sex, national origin, disability or protected veteran status. You can find out more about your rights under the law at
If you are applying for a role and have a physical or mental disability, we will support you with your application or through the hiring process.
Full-Time