System Reliability Engineer II

Employer Active

1 Vacancy
Job Location

Riyadh - Saudi Arabia

Monthly Salary

Not Disclosed

Job Description

We don't think about job roles in a traditional way. We are anti-silo. Anti-career stagnation. Anti-conventional.

Beyond ONE is a digital services provider radically reshaping the personalised digital ecosystems of consumers in high-growth markets around the world. We're building a digital services aggregator platform with a strong telco foundation and a profitable growth strategy that empowers users to drive their own experience: subscribe once, source from many, and only pay for what you actually use.

Since being founded in 2021, we've acquired Virgin Mobile MEA, Friendi Mobile MEA, and Virgin Mobile LATAM (with 6.5 million subscribers) and 1,600 dedicated colleagues across Chile, Colombia, KSA, Kuwait, Mexico, Oman, and UAE.

To disrupt for good takes a rebellious spirit, a questioning mind, and a warm heart. We really care about how to get things done and not who manages who. We benefit from our diversity, and together we disrupt the way we and others think about our lives, for good.

Do you want to exchange ideas, learn from each other, and leave your mark on our journey? This is the place for you.

Role Purpose
Why this role matters: As a Site Reliability Engineer (SRE), you will play a key role in enhancing system reliability, scalability, and performance through automation, monitoring, and operational excellence. Your contributions will help shape our reliability engineering practices and platform stability, ultimately transforming how we deliver resilient and scalable services to users.

What success looks like: In your first year, you will:

  • Build and maintain automated systems to improve service uptime and incident response.
  • Implement and refine monitoring and alerting strategies to proactively detect issues.
  • Drive operational efficiencies by reducing toil and introducing reliability-focused tooling.

Why this is for you: If you're keen on solving availability, latency, and performance issues at scale, hit us up. We're looking for someone ready to tackle this challenge head-on and make an impact from day one.

Key Responsibilities
In this role, you will:

  • Lead the development of resilient, highly available systems and incident response strategies.
  • Collaborate with software and infrastructure teams, driving reliability and observability initiatives.
  • Manage production infrastructure and environments, ensuring optimal performance and uptime.
  • Automate operational tasks using infrastructure-as-code and scripting tools.
  • Design and maintain monitoring and alerting systems using Prometheus, Grafana, or similar.
  • Conduct blameless postmortems and implement learnings to prevent future incidents.
  • Implement SLOs, SLIs, and error budgets to guide engineering decisions.
  • Optimize CI/CD pipelines and deployment processes for reliability and speed.
  • Engage with stakeholders to align reliability goals with business outcomes.

Qualifications & Attributes
We're seeking someone who embodies the following:

Education: Bachelor's degree in Computer Science, Engineering, or a related field.
Experience: 3 years in Site Reliability Engineering, DevOps, or similar operational roles.

Technical Skills:
Must-haves:

  • Strong background in Linux/Unix systems and network administration.
  • Experience with cloud platforms (AWS, Azure, or GCP).
  • Experience implementing SLOs, SLIs, and error budget policies.
  • Proficiency in infrastructure automation (Terraform, Ansible) and scripting (Python, Go, or Bash).
  • Deep understanding of monitoring, observability, and incident management tools (Prometheus, Grafana, Splunk, etc.).
  • Solid grasp of CI/CD practices, containerization (Docker), and orchestration (Kubernetes).

Nice-to-haves:

  • Familiarity with distributed systems, service meshes, and performance tuning.

Unique Attributes:

  • Thrives in fast-paced environments requiring quick decision-making.
  • Possesses a proactive mindset and a calm, analytical approach to troubleshooting under pressure.
  • Excels with SRE best practices, modern ops philosophies, and large-scale system thinking.

What we offer:

  • Rapid learning opportunities - we enable learning through flexible career paths and exposure to challenging & meaningful work that will help build and strengthen your expertise.
  • Hybrid work environment - flexibility to work from home 2 days a week.
  • Healthcare and other local benefits offered in market.

By submitting your application, you acknowledge and consent to the use of Greenhouse & BrightHire during the recruitment process. This may include the storage and processing of your data on servers located outside your country of residence. For further information please contact us at

Employment Type

Full Time
