Description
Join a dynamic team shaping the tech backbone of our operations, where your expertise fuels seamless system functionality and innovation.
As a Site Reliability Engineer II at JPMorgan Chase within the Commercial and Investment Bank, you will play a vital role in ensuring the operational stability, availability, and performance of our production application flows. Your efforts in troubleshooting, maintaining, identifying, escalating, and resolving production service interruptions for all internally and externally developed systems support a seamless user experience and a culture of continuous improvement.
Job responsibilities
- Analyze and troubleshoot production application flows to ensure end-to-end application or infrastructure service delivery supporting the business operations of the firm
- Improve operational stability and availability through participation in problem management
- Multi-task in a complex production environment and quickly acquire broad knowledge of applications
- Monitor production environments for anomalies and address issues utilizing standard observability tools
- Assist in the escalation and communication of issues and solutions to the business and technology stakeholders
- Identify trends and assist in the management of incidents, problems, and changes in support of full stack technology systems, applications, or infrastructure
Required qualifications, capabilities, and skills
- Formal training or certification in Site Reliability concepts and 2 years of applied experience
- Experience in troubleshooting, resolving, and maintaining information technology services
- Knowledge of applications or infrastructure in a large-scale technology environment, on premises or in the public cloud
- Exposure to observability and monitoring tools and techniques
- Familiarity with processes in scope of the Information Technology Infrastructure Library (ITIL) framework
- Basic understanding of DevOps and SRE methodologies or processes
Preferred qualifications, capabilities, and skills
- Knowledge of one or more general purpose programming languages or automation scripting
- Understanding of the design and implementation of proactive monitoring and health checks
- Strong technical skills (e.g., containers, SQL, scripting)