Kyriba's Cloud Operations team safeguards the availability, performance, and reliability of our industry-leading SaaS platform, powering solutions in payments, risk management, business intelligence, and fraud prevention. We manage Linux and Windows workloads on AWS across the APAC, EMEA, and AMER regions, applying a software engineering mindset to systems administration with a strong focus on automation, observability, and continuous improvement.
We're looking for a passionate and skilled Cloud Operations Engineer to join our global Cloud Operations team. In this role, you'll help ensure the availability, stability, and performance of our SaaS platform, which runs on AWS and serves enterprise customers worldwide. You'll work at the intersection of operations, automation, and reliability, using Infrastructure as Code and modern tooling to drive operational excellence and continuous improvement.
Key Responsibilities
- Ensure the availability, reliability, and performance of production platforms and applications.
- Troubleshoot issues, perform Root Cause Analysis (RCA), and automate remediation for recurring problems.
- Orchestrate and deploy infrastructure and services on AWS using Infrastructure as Code principles.
- Build and maintain deployment automation using Ansible, enhancing existing playbooks, roles, and CI/CD integrations.
- Develop tooling for failure detection, remediation, OS patching, and deployments.
- Maintain and improve documentation, standard operating procedures (SOPs), and architecture diagrams.
- Patch, upgrade, and harden systems; drive performance and resource optimization.
- Coordinate incident responses, communicate with impacted teams, and ensure timely resolution.
- Participate in rotational on-call duties and support Saturday deployments as needed.
Qualifications
- 2-5 years of experience in Linux or Windows server administration (SaaS or production environments preferred).
- Strong Linux administration skills; working knowledge of Windows Server.
- Hands-on experience with AWS (EC2, RDS, S3, IAM, CloudWatch, VPC, Systems Manager, etc.).
- Proven automation experience with Ansible (strongly preferred).
- Scripting skills for automation using Python, shell, or PowerShell.
- Strong communication and documentation skills in English.
- Demonstrates thoroughness, critical thinking, and sound judgment when troubleshooting and implementing solutions.
- Communicates clearly, manages pressure calmly, and works effectively with global teams and stakeholders to achieve win-win results.
- Bonus points for experience with Kubernetes (EKS), Terraform, Docker, complex troubleshooting, ITIL/incident management, or Oracle RDBMS.
Additional Information
Remote Work: No
Employment Type: Full-time