Senior Site Reliability Engineer
Quadcode

Employer Active

1 Vacancy
This job posting is outdated and the position may already be filled

Jobs by Experience

3 - 4 years

Job Location

Amman - Jordan

Monthly Salary

Not Disclosed

Nationality

Any Nationality

Gender

N/A

Vacancy

1 Vacancy

Job Description

Req ID: 2479770

Senior Site Reliability Engineer

Tech stack:

  • OS: Linux Ubuntu;
  • Web server: Nginx;
  • Monitoring: Grafana, Prometheus, Graylog, Jaeger;
  • CI/CD: Jenkins, Git, GitLab, Docker;
  • Automation: Python, Bash;
  • SCM: Ansible, Chef;
  • IaC: Terraform, Pulumi;
  • DB: PostgreSQL, Redis, KeyDB, MySQL;
  • Cloud: OpenStack, AWS, GCP, DO.

Examples of first tasks in the role:

  • Review processes, platform and infrastructure;
  • Implementation of Grafana OnCall;
  • Review and rework ITSM processes if needed.

Responsibilities in the role:

  • Identification of bottlenecks and preparation of recommendations to improve the reliability of services;
  • Responding to platform emergencies, localizing and resolving the causes of failures, compiling postmortem reports;
  • Development of monitoring and alerting tools that ensure high availability and quick detection of potential issues (Grafana, Grafana OnCall, Prometheus Alertmanager, etc.);
  • Active participation in change management processes, including assessment and coordination of changes to the infrastructure within Change Advisory Board (CAB) sessions;
  • Implementation and support of ITSM processes to optimize team workflow and enhance service quality;
  • Development and maintenance of documentation in an up-to-date state.

Requirements:

  • 3+ years of experience in SRE/DevOps;
  • Understanding of SRE principles, practical experience in implementing SRE practices;
  • Understanding of principles and practical experience in building resilient systems;
  • Experience with monitoring and logging systems (Prometheus, Graylog, Grafana);
  • Experience with automation tools for software build and deployment (CI/CD): GitLab, Jenkins;
  • Understanding of virtualization and containerization principles;
  • Understanding of Infrastructure as Code (IaC) approaches and experience applying them;
  • Proficiency in a programming language for automation script development (Python, Node.js, Golang, etc.), ability to understand service code;
  • Understanding of network protocols, topologies, and network models;
  • Experience with configuration management tools: Ansible, Chef;
  • Basic experience with relational databases, such as PostgreSQL;
  • Experience in administering Linux operating systems;
  • Fluency in English and Russian (B2 minimum).

As an advantage:

  • Experience in implementing monitoring and logging systems from scratch;
  • Experience with Kubernetes (k8s), OpenStack;
  • Advanced programming skills in any language.

Employment Type

Full Time

Department / Functional Area

Engineering
