Responsibilities:
Design and build high-availability architecture on Kubernetes clusters, including handling upgrades, capacity planning, and cost optimization.
Develop, maintain, and manage tools to automate operational activities and enhance engineering productivity.
Maintain our monitoring stack (VictoriaMetrics, Jaeger, Grafana), build alerts to detect anomalies before they reach users, and handle incident response and post-mortem analysis to improve system availability.
Ensure database reliability, including handling replication, failover, and performance tuning across various datastores (PostgreSQL, MySQL, MongoDB, Elasticsearch, Kafka, Redis, and Memcached).
Manage all infrastructure using IaC (OpenTofu, Ansible, ArgoCD, Helm). Develop modular and reusable IaC for the entire stack.
Develop, optimize, and maintain GitHub Actions, GitLab CI, and CircleCI workflows to ensure fast and reliable delivery.
Ensure a secure production environment, including managing secrets, firewalls, and access controls.
Analyze infrastructure utilization and ensure high availability of the infrastructure while minimizing cost.
Update, track, and resolve technical issues in a timely manner.
Explore and integrate AI-driven insights into operational processes to improve reliability, reduce noise, and empower engineering teams with intelligent decision-making.
Qualifications :
At least 4 years of experience in SRE, DevOps, or Systems Engineering, or as a Software Engineer with relevant systems engineering experience.
Strong experience in building scalable systems. Experience with microservices and exposure to highly available web-scale systems are a must.
Knowledge of modern tech stacks. Comfortable writing tools and automation in Bash, Python, or Go.
Deep experience with storage technologies, including databases (Postgres and MySQL), document stores (MongoDB), caching systems (Redis, Memcached), queue and pub/sub systems (Kafka), and search engines (Elasticsearch).
Experience working with cloud environments (GCP and AWS; GCP certification is a plus).
Hands-on experience with containerization and orchestration (Docker, Kubernetes).
Strong experience with CI/CD tools such as GitHub Actions, Ansible, Terraform/OpenTofu, Helm, and ArgoCD.
User obsession and empathy.
Focus on impact and results. You work on the right things and get them done.
Drive and resourcefulness to persevere, overcome obstacles, and achieve challenging goals.
High integrity and ability to collaborate positively with others.
Additional Information :
We are looking for a strong DevOps / SRE Engineer to join the Laku6 part of Carousell Group. You will be the operational backbone of our marketplace, focusing on the reliability, scalability, and performance of our infrastructure. You will manage our GKE clusters and databases while leveraging the central platform tools to drive operational excellence.
100% Remote Work from Home
By proceeding with your application, you are adhering to our PDPA. In case you are interested in knowing more, read our Candidates Personal Data Privacy Statement.
Remote Work :
Yes
Employment Type :
Full-time
Carousell Group is the leading multi-category platform for secondhand in Greater Southeast Asia on a mission to make secondhand the first choice. Founded in August 2012 in Singapore, the Group has a leading presence in seven markets under the brands Carousell, Carousell Media Group, C ...