Propel operational success with your expertise in technology support and a commitment to continuous improvement.
As a Site Reliability Engineer III team member in Securities Services Technology, you will ensure the operational stability, availability, and performance of our production application flows. You will encourage a culture of continuous improvement as you troubleshoot, maintain, identify, escalate, and resolve production service interruptions for all internally and externally developed systems, leading to a seamless user experience.
Job responsibilities
Provide business-facing technology support to business and operations groups across the Asia Pacific region.
Work within a follow-the-sun support model with global counterparts.
Manage production technology incidents to resolution, ensuring timely engagement, escalation, and effective communication to business, technology, and vendor partners.
Perform post-incident analysis, identifying, tracking, and implementing preventative measures.
Act as Subject Matter Expert (SME) for key applications, responsible for maintaining global best practice and hygiene standards.
Assist in the monitoring of production environments for anomalies and address issues utilizing standard observability tools.
Act as a key contributor in the continued development of tools, frameworks, and techniques to improve the productivity and quality of production support, adopting SRE principles to manage and support the environment.
Analyze complex situations and trends to anticipate and solve incident, problem, and change management issues in support of full stack technology systems, applications, or infrastructure.
Required qualifications, capabilities, and skills
6 years of previous experience in technology for the financial and banking sector, with expertise in troubleshooting, resolving, and maintaining information technology services.
Bachelor's degree in Engineering, Computer Science, or Information Technology.
Proven track record in Production Support and Site Reliability Engineering (SRE), with a clear understanding of SRE protocols and methodologies.
Familiarity with observability, service level objectives, alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others.
Support management skills: design and use monitoring dashboards for day-to-day support, generate service KPIs, report on service stability and performance, and monitor logs.
Proven track record of running Incident and Problem Management calls for business-impacting outages, performing post-incident analysis, and identifying and implementing preventative measures and lessons learned following outages.
Excellent interpersonal, relationship, and communication skills, along with strong analytical and problem-solving skills.
Able to drive issue resolution across different support teams.
Experience in debugging and maintaining applications in a large corporate environment with one or more modern programming languages and database querying languages.
Preferred qualifications, capabilities, and skills
Passion for learning new technologies and driving innovative solutions.
Experience with Kubernetes for container orchestration.
Scripting languages: Perl, Python, Linux/UNIX shell. Databases: Oracle, MS-SQL, PostgreSQL, and NoSQL databases (Cassandra).
Telemetry and application performance monitoring tools such as Splunk, AppDynamics, Dynatrace, Grafana, and ITRS Geneos.
Experience with core AWS services such as EC2, S3, EKS, RDS, and CloudWatch (and Datadog).
Exposure to agile methodologies and to Continuous Integration (CI) and Continuous Delivery (CD) tools such as Jenkins and Terraform.
Experience in technology disaster recovery planning and test execution; prior experience in Java development is a plus.
Full-Time