Senior System Reliability Engineer
Job Summary
We don't think about job roles in a traditional way. We are anti-silo. Anti-career stagnation. Anti-conventional.
Beyond ONE is a digital services provider radically reshaping the personalised digital ecosystems of consumers in high-growth markets around the world. We're building a digital services aggregator platform with a strong telco foundation and a profitable growth strategy that empowers users to drive their own experience: subscribe once, source from many, and only pay for what you actually use.
Since being founded in 2021, we've acquired Virgin Mobile MEA, Friendi Mobile MEA, and Virgin Mobile LATAM (with 6.5 million subscribers) and 1,600 dedicated colleagues across Chile, Colombia, KSA, Kuwait, Mexico, Oman, Pakistan, and UAE.
To disrupt for good takes a rebellious spirit, a questioning mind, and a warm heart. We really care about how to get things done and not who manages who. We benefit from our diversity, and together we disrupt the way we and others think about our lives, for good.
Do you want to exchange ideas, learn from each other, and leave your mark on our journey? This is the place for you.
Role Purpose
Why this role matters: As a System Reliability Engineer-II, you will play a key role in improving the reliability, scalability, and performance of critical platform services. Your contributions will help shape our site reliability engineering practices and infrastructure and, ultimately, the way we disrupt the market.
What success looks like: In your first year, you will lead key automation and observability initiatives, reduce manual operational tasks through tooling and infrastructure-as-code, and improve system uptime through proactive incident management and performance tuning.
Why this is for you: If you're keen on solving the challenge of scaling complex systems reliably and efficiently, hit us up. We're looking for someone ready to tackle this challenge head-on and make an impact from day one.
Key Responsibilities
In this role you will:
- Lead the development of automation and tooling to improve deployment, monitoring, and incident resolution, ensuring greater system efficiency and scalability.
- Collaborate with platform, product, and security teams, driving the adoption of reliability best practices across services.
- Manage on-call responsibilities and incident management processes, ensuring rapid detection, resolution, and follow-up improvements.
- Drive the implementation of observability strategies, including metrics, logging, and alerting.
- Contribute to infrastructure-as-code, self-healing systems, and continuous improvement in system performance and availability.
- Participate in architectural reviews and chaos engineering initiatives to build resilient systems.
Qualifications & Attributes
We're seeking someone who embodies the following:
Education: Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
Experience: 6 years in Site Reliability Engineering, DevOps, or Infrastructure roles, especially in high-availability, large-scale environments.
Technical Skills:
Must-haves:
- Proficiency in programming/scripting (e.g. Python, Go, Bash)
- Strong Linux and networking fundamentals
- Experience with cloud platforms (AWS, GCP, or Azure) and Kubernetes
- Experience with observability tools like Prometheus, Grafana, or ELK
- Experience with Terraform
Nice-to-haves:
- Familiarity with incident response tools, performance testing, and capacity planning practices.
- Knowledge and hands-on experience with Virtualisation is a plus.
Unique Attributes:
- Thrives in high-scale environments requiring proactive ownership and rapid decision-making.
- Possesses strong communication skills and a pragmatic, collaborative mindset.
- Excels with DevOps principles, infrastructure-as-code, and reliability engineering practices.
What we offer:
- Rapid learning opportunities - we enable learning through flexible career paths and exposure to challenging & meaningful work that will help build and strengthen your expertise.
- Hybrid work environment - flexibility to work from home 2 days a week in our UAE & Pakistan offices.
- Healthcare and other local benefits offered in market.
By submitting your application, you acknowledge and consent to the use of Greenhouse & BrightHire during the recruitment process. This may include the storage and processing of your data on servers located outside your country of residence. For further information, please contact us at
Required Experience:
Senior IC
Key Skills
- Kubernetes
- FMEA
- Continuous Improvement
- Elasticsearch
- Go
- Root Cause Analysis
- Maximo
- CMMS
- Maintenance
- Mechanical Engineering
- Manufacturing
- Troubleshooting