Site reliability engineer

Employer Active

1 Vacancy
Job Location

London - UK

Monthly Salary

Not Disclosed

Job Description

About this role

We are looking for a foundational member of the cloud infrastructure team at Writer. This role involves contributing to the development and implementation of our site reliability engineering (SRE) program. The ideal candidate will ensure the reliability, scalability, performance, and security of Writer's critical systems, taking a proactive approach to guarantee that our high-ROI products reach our customers seamlessly.


Your responsibilities:

  • Lead the design, implementation, and maintenance of Writer Inc.'s cloud infrastructure to ensure high availability and performance

  • Design and implement scalable cloud automation to support seamless deployment for our largest enterprise customers

  • Automate infrastructure provisioning and management using Terraform and Python

  • Collaborate with development teams to optimize cloud resources and enhance system reliability

  • Develop and maintain monitoring and alerting systems to proactively identify and resolve issues affecting the reliability of our writing solutions

  • Conduct post-mortem analyses of system failures to identify root causes and implement preventive measures

  • Optimize and scale our cloud infrastructure to support growing user demand and ensure cost efficiency

  • Ensure the security and compliance of our systems, adhering to industry standards and regulations

  • Provide mentorship and technical guidance to junior engineers, fostering a culture of reliability and continuous improvement

  • Stay current with emerging technologies and industry trends to continuously improve our site reliability practices

Is this you?

  • Proven expertise in Site Reliability Engineering with a minimum of 7 years of hands-on experience

  • Deep understanding of system architecture and infrastructure design to ensure high availability and performance

  • Bachelor's degree in Computer Science, Engineering, or a related technical field

  • Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring

  • Experience with cloud platforms like AWS, Azure, or GCP and their respective services for scalable and resilient systems

  • Expertise in containerization technologies (e.g. Docker, Kubernetes) and orchestration tools

  • Knowledge of monitoring and logging tools (e.g. Prometheus, Grafana, ELK Stack) to maintain system health and performance

  • Ability to lead and mentor junior engineers in best practices for reliability and system optimization

  • Excellent communication skills to collaborate effectively with cross-functional teams and stakeholders

  • Proactive approach to identifying and mitigating potential system failures and performance bottlenecks

  • Preferred skills & experience:

    • Software engineering expertise

    • Terraform

    • Python

    • Kubernetes

    • Scala

    • AWS/GCP

Benefits & perks (UK full-time employees):

Employment Type

Full-Time
