Lead Site Reliability Engineer

1 Vacancy
The job posting is outdated and position may be filled
Job Location

Bangkok - Thailand

Monthly Salary

Not Disclosed

Job Description

Support application services in the areas of responsibility to ensure BAU (business as usual) stability, meet the expected incident SLAs and the system availability levels defined for peak and off-peak periods, and be able to apply workaround solutions by modifying the current source code.

Perform root cause analysis (RCA) through deep code analysis to troubleshoot issues immediately and resolve them (short term, medium term, and long term) within the incident SLA, taking both proactive and reactive action.

Perform BAU system setup, bug fixing, and small change requests (CRs) following the IT implementation methodology (build, test, deploy), aligned with company security and business objectives and strategy.

Manage regular system patch upgrades with the product owner and business stakeholders.

Manage monitoring tools by creating scripts, robots, or AI, and ensure there is no business disruption.

Manage the support workbook and control. Ensure the knowledge base is well organized and kept up to date.

Be familiar with REST APIs (synchronous process), message producer/consumer processes (asynchronous process), and batch processes.

Be familiar with open-source monitoring tools such as the ELK stack and Grafana.

Be familiar with container technologies such as Docker and Kubernetes (K8s).

Be familiar with cloud technologies such as AWS, Azure, and Tencent Cloud.

Qualification

Bachelor's degree in Computer Science or a related field

13 years in SRE or Support Engineer roles

Strong in programming (Java, Go), basic SQL, Linux/Unix scripting, and cloud platforms (AWS, Azure, Tencent Cloud).

Hands-on experience with Docker and Kubernetes (K8s), including deploying, scaling, and troubleshooting.

Skilled in diagnosing and resolving issues quickly with experience in root cause analysis and incident response.

Knowledge of SLAs, SLOs, and automation to improve system reliability and reduce manual intervention.

Good English proficiency

Employment Type

Full Time
