Site Reliability Engineer

Arvion Services


Job Location: Kuala Lumpur, Malaysia

Monthly Salary: Not Disclosed
Posted on: 11 hours ago
Vacancies: 1

Job Summary

Job Overview

We are seeking a Site Reliability Engineer (SRE) to support large-scale, distributed, fault-tolerant systems for a global technology environment. This role combines software engineering and systems operations to improve system reliability, scalability, automation, and performance.

What Will You Do:

  • Design, build, and maintain scalable and highly available infrastructure systems.
  • Develop automation tools and scripts to improve operational efficiency.
  • Monitor system performance and troubleshoot infrastructure issues proactively.
  • Implement monitoring, alerting, SLIs, SLOs, and SLA tracking.
  • Participate in 24/7 on-call rotations and incident response activities.
  • Conduct root cause analysis and support post-mortem reviews.
  • Collaborate with engineering and cross-functional teams on system improvements.
  • Ensure infrastructure security, compliance, and reliability best practices.
  • Support containerized environments using Docker and Kubernetes.

What Makes You A Good Fit:

  • Bachelor's or Master's degree in Computer Science, IT, Engineering, or a related field.
  • Minimum 3 years of experience in SRE, Systems Engineering, or Software Engineering.
  • Proficient in programming languages such as Python, Go, Java, or C.
  • Strong Linux systems and networking knowledge.
  • Experience with Docker, Kubernetes, Prometheus, and Grafana is preferred.
  • Knowledge of relational databases and system architecture.
  • Strong analytical, troubleshooting, and communication skills.

What We Offer:

  • Opportunity to work on large-scale global infrastructure systems.
  • Exposure to advanced cloud automation and reliability engineering practices.
  • Career growth within a dynamic technology environment.
  • Collaborative and fast-paced team culture.

