Cloud Operations Engineer (IGT1 Kyriba)

Colombo - Sri Lanka

Monthly Salary: Not Disclosed

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

Kyriba's Cloud Operations team safeguards the availability, performance, and reliability of our industry-leading SaaS platform, powering solutions in payments, risk management, business intelligence, and fraud prevention. We manage Linux and Windows workloads on AWS across the APAC, EMEA, and AMER regions, applying a software engineering mindset to systems administration with a strong focus on automation, observability, and continuous improvement.

We're looking for a passionate and skilled Cloud Operations Engineer to join our global Cloud Operations team. In this role, you'll help ensure the availability, stability, and performance of our industry-leading SaaS platform, which runs on AWS and serves enterprise customers worldwide. You'll work at the intersection of operations, automation, and reliability, using Infrastructure as Code and modern tooling to drive operational excellence and continuous improvement.

Key Responsibilities

Ensure the availability, reliability, and performance of production platforms and applications.
Troubleshoot issues, perform Root Cause Analysis (RCA), and automate remediation for recurring problems.
Orchestrate and deploy infrastructure and services on AWS using Infrastructure as Code principles.
Build and maintain deployment automation using Ansible, enhancing existing playbooks, roles, and CI/CD integrations.
Develop tooling for failure detection, remediation, OS patching, and deployments.
Maintain and improve documentation, standard operating procedures (SOPs), and architecture diagrams.
Patch, upgrade, and harden systems; drive performance and resource optimization.
Coordinate incident responses, communicate with impacted teams, and ensure timely resolution.
Participate in rotational on-call duties and support Saturday deployments as needed.

Qualifications

2-5 years of experience in Linux or Windows server administration (SaaS or production environments preferred).
Strong Linux administration skills; working knowledge of Windows Server.
Hands-on experience with AWS (EC2, RDS, S3, IAM, CloudWatch, VPC, Systems Manager, etc.).
Proven automation experience with Ansible (strongly preferred).
Scripting skills for automation using Python, shell, or PowerShell.
Strong communication and documentation skills in English.
Demonstrates thoroughness, critical thinking, and sound judgment when troubleshooting and implementing solutions.
Communicates clearly, manages pressure calmly, and works effectively with global teams and stakeholders to achieve win-win results.
Bonus points for experience with Kubernetes (EKS), Terraform, Docker, complex troubleshooting, ITIL/incident management, or Oracle RDBMS.

Additional Information :

Remote Work :

Employment Type: Full-time

Key Skills

Change Management
Software Deployment
Cloud Infrastructure
High Availability
IaaS
Firewall
Linux
Middleware
JBoss
Network Architecture
Scripting
Technical Support

About Company

IFS

We are growing! At IFS, we are constantly growing to deliver award-winning solutions to hundreds of partners and thousands of customers worldwide. We help companies that want to be their best when it matters most: at their #momentofservice. Visit https://ifs.link/IzM0px to find out mo ...
