Site Reliability Engineer

Velenosi&Meredith

Job Location:

George Town - Malaysia

Monthly Salary: Not Disclosed
Posted on: 9 hours ago
Vacancies: 1 Vacancy

Job Summary

Overview

We are seeking an experienced Site Reliability Engineer (SRE) to join a dynamic technology team supporting large-scale infrastructure and AML systems. This role combines software engineering, systems engineering, automation, and operational excellence to ensure high availability, scalability, and reliability across critical platforms.

The ideal candidate is passionate about infrastructure automation, system performance, cloud-native technologies, and operational reliability in fast-paced environments.

Key Responsibilities

  • Design, build, and maintain highly available, scalable, and fault-tolerant systems
  • Collaborate closely with software engineering teams to improve system reliability and performance
  • Develop and maintain automation tools and operational procedures to improve efficiency and reduce manual intervention
  • Monitor infrastructure and application performance to proactively identify and resolve issues
  • Implement and maintain monitoring, alerting, and observability solutions, including SLIs, SLOs, and SLAs
  • Participate in 24/7 on-call rotations, incident management, root-cause analysis, and blameless post-mortems
  • Ensure infrastructure security, compliance, and operational best practices
  • Support large-scale web traffic and machine learning data processing environments

Requirements

Technical Skills

  • Proficiency in at least one programming language such as Python, Go, Java, or C
  • Strong scripting and automation skills
  • Good understanding of Linux operating systems and network architecture
  • Experience with Docker and Kubernetes
  • Hands-on experience with monitoring tools such as Prometheus and Grafana
  • Knowledge of relational databases and database modeling

Preferred Skills

  • Exposure to machine learning frameworks such as TensorFlow, PyTorch, MXNet, or PaddlePaddle
  • Strong analytical and problem-solving abilities
  • Excellent communication and collaboration skills
  • Ability to work effectively in a fast-paced and cross-functional environment

Qualifications

  • Bachelor's or Master's degree in Computer Science, Information Technology, Computer Engineering, or a related field
  • Minimum 3 years of experience in Site Reliability Engineering, Systems Engineering, or Software Engineering

Why Join Us

  • Opportunity to work on large-scale distributed systems and modern infrastructure technologies
  • Exposure to cloud-native environments and advanced automation practices
  • Collaborative and technology-driven working environment
  • Career growth and continuous learning opportunities
  • Competitive salary and benefits package