Our esteemed client, an established MNC, is searching for a Site Reliability Engineer:
Job Responsibilities:
- Oversee observability, capacity planning, issue analysis, and troubleshooting for large-scale cloud-native applications in a microservices architecture.
- Debug and automate routine tasks across operating systems, networks, databases, and application servers, leveraging programming skills beyond basic scripting.
- Apply DevOps processes and programming knowledge in at least one of the following languages: Java, Python, or Go.
- Utilize scripting and automation tools such as Shell, Terraform, Ansible, Chef, or Puppet for automation and infrastructure management.
- Possess deep expertise in Unix/Linux systems, virtual machines, containers, container management systems, enterprise cloud platforms, and data structures.
- Manage the lifecycle of services, from launch to deployment, operation, and optimization, ensuring reliability and a seamless user experience.
- Monitor and enhance service reliability by measuring availability, latency, and system health while implementing sustainable incident response strategies.
- Gather and analyze metrics to optimize performance and troubleshoot priority-level (P0/P1/P2/P3) issues.
- Contribute to system design recommendations and platform management, and balance feature development speed with reliability based on service level objectives.
- Continuously measure and optimize system performance, anticipating and addressing potential user needs while driving innovation and improvements.
Job Requirements:
- Bachelor's degree or higher in Computer Science, Electronics & Communication, or a related field.
- Minimum 2 years' experience in a related field.
- Strong understanding of SRE principles and DevOps processes.
- Exposure to data-driven decision-making and trend analysis.
- Experience designing automation frameworks using SaltStack, Spinnaker, or StackStorm.
- Experience managing large-scale big data clusters and optimizing data processing efficiency.
- Knowledge of Chaos Engineering principles for system resilience testing.
- Expertise in large-scale container management platforms with autoscaling and intelligent scheduling.
- Experience in big data analysis, data science, or large-scale data development.
- Understanding of SIEM (Security Information and Event Management), threat modeling, and vulnerability detection.
- Hands-on experience in cloud services, network design, policy creation, and performance tuning.
- Proficiency in database consistency checks, slow query optimization, and middleware performance tuning for RDBMS, NoSQL, and distributed caches.
Additional Information:
- Salary: Up to MYR 9000
- Working Location: Cyberjaya, MY
- Working Hours: Monday to Friday, 9am to 6pm
- 1-year renewable contract.
Interested parties, kindly click on APPLY NOW or send in your resume in MS Word format to
*We regret that only shortlisted candidates will be notified*
TSTAR Recruit Pte Ltd EA Licence No:22C1039 Co.Reg.No.Z EA Registration No.: R(SIA KAI SING)
Salary package:
MYR 5000 to MYR 7000 if exp is 2-5 yrs
(OR) around MYR 7000 to MYR 9000 if 5 yrs exp