System Reliability Engineer II

Employer active

1 vacancy

Job location

Riyadh - Saudi Arabia

Monthly salary

Not disclosed

Job description

We don't think about job roles in a traditional way. We are anti-silo. Anti-career stagnation. Anti-conventional.

Beyond ONE is a digital services provider radically reshaping the personalised digital ecosystems of consumers in high-growth markets around the world. We're building a digital services aggregator platform with a strong telco foundation and a profitable growth strategy that empowers users to drive their own experience: subscribe once, source from many, and only pay for what you actually use.

Since being founded in 2021, we've acquired Virgin Mobile MEA, Friendi Mobile MEA, and Virgin Mobile LATAM (with 6.5 million subscribers), and we have 1,600 dedicated colleagues across Chile, Colombia, KSA, Kuwait, Mexico, Oman, and UAE.

To disrupt for good takes a rebellious spirit, a questioning mind, and a warm heart. We really care about how to get things done and not who manages whom. We benefit from our diversity, and together we disrupt the way we and others think about our lives, for good.

Do you want to exchange ideas, learn from each other, and leave your mark on our journey? This is the place for you.

Role Purpose
Why this role matters: As a Site Reliability Engineer (SRE), you will play a key role in enhancing system reliability, scalability, and performance through automation, monitoring, and operational excellence. Your contributions will help shape our reliability engineering practices and platform stability, ultimately transforming how we deliver resilient and scalable services to users.

What success looks like: In your first year, you will:

  • Build and maintain automated systems to improve service uptime and incident response.
  • Implement and refine monitoring and alerting strategies to proactively detect issues.
  • Drive operational efficiencies by reducing toil and introducing reliability-focused tooling.

Why this is for you: If you're keen on solving availability, latency, and performance issues at scale, hit us up. We're looking for someone ready to tackle this challenge head-on and make an impact from day one.

Key Responsibilities
In this role, you will:

  • Lead the development of resilient, highly available systems and incident response strategies.
  • Collaborate with software and infrastructure teams, driving reliability and observability initiatives.
  • Manage production infrastructure and environments, ensuring optimal performance and uptime.
  • Automate operational tasks using infrastructure-as-code and scripting tools.
  • Design and maintain monitoring and alerting systems using Prometheus, Grafana, or similar.
  • Conduct blameless postmortems and implement learnings to prevent future incidents.
  • Implement SLOs, SLIs, and error budgets to guide engineering decisions.
  • Optimize CI/CD pipelines and deployment processes for reliability and speed.
  • Engage with stakeholders to align reliability goals with business outcomes.

Qualifications & Attributes
We're seeking someone who embodies the following:

Education: Bachelor's degree in Computer Science, Engineering, or a related field.
Experience: 3 years in Site Reliability Engineering, DevOps, or similar operational roles.

Technical Skills:
Must-haves:

  • Strong background in Linux/Unix systems and network administration.
  • Experience with cloud platforms (AWS, Azure, or GCP).
  • Experience implementing SLOs, SLIs, and error budget policies.
  • Proficiency in infrastructure automation (Terraform, Ansible) and scripting (Python, Go, or Bash).
  • Deep understanding of monitoring, observability, and incident management tools (Prometheus, Grafana, Splunk, etc.).
  • Solid grasp of CI/CD practices, containerization (Docker), and orchestration (Kubernetes).

Nice-to-haves:

  • Familiarity with distributed systems, service meshes, and performance tuning.

Unique Attributes:

  • Thrives in fast-paced environments requiring quick decision-making.
  • Possesses a proactive mindset and a calm, analytical approach to troubleshooting under pressure.
  • Excels with SRE best practices, modern ops philosophies, and large-scale system thinking.

What we offer:

  • Rapid learning opportunities - we enable learning through flexible career paths and exposure to challenging & meaningful work that will help build and strengthen your expertise.
  • Hybrid work environment - flexibility to work from home 2 days a week.
  • Healthcare and other local benefits offered in market.

By submitting your application, you acknowledge and consent to the use of Greenhouse & BrightHire during the recruitment process. This may include the storage and processing of your data on servers located outside your country of residence. For further information, please contact us at

Employment type

Full-time
