Lead Software Engineer - Resiliency

JPMorganChase

Job Location: Columbus, NE - USA
Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1

Job Summary

Description

Be an integral part of an agile team that's constantly pushing the envelope to enhance, build, and deliver top-notch technology products.

As a Lead Site Reliability Engineer at JPMorgan Chase within the Employee Compute Branch Team, you will play a pivotal role in designing, implementing, and overseeing automation for observability and notification across a diverse set of systems in a global Microsoft Windows environment. You will lead by example, bringing hands-on expertise in PowerShell and C# and infusing best practices into a team of highly experienced system engineers. Your work will directly impact the reliability, scalability, and efficiency of our platforms, with a strong focus on cloud (Azure and AWS) integration.

Job Responsibilities

  • Champion site reliability engineering culture and practices, exerting technical influence across the team.
  • Lead the design and hands-on implementation of automated observability and notification solutions using PowerShell and C#.
  • Drive initiatives to improve reliability and stability of applications and platforms through data-driven analytics and automation.
  • Collaborate with team members to define and implement service level indicators, objectives, and error budgets.
  • Architect and implement monitoring, alerting, and telemetry solutions using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
  • Act as the primary technical lead during major incidents, quickly identifying and resolving issues to minimize impact.
  • Mentor and upskill system engineers, fostering a programming mindset and best practices in automation and reliability.
  • Facilitate cross-team and cross-region collaboration, ensuring alignment and knowledge sharing.
  • Document and share technical solutions and best practices within internal forums and communities of practice.
  • Engage with stakeholders to understand business needs and translate them into technical solutions, with increasing responsibility over time.
  • Break down complex problems into actionable work for the team, ensuring clear direction and accountability.

Required qualifications, capabilities, and skills

  • Formal training or certification on Site Reliability Engineering concepts and 5 years of applied experience.
  • Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, and toil reduction, with proven ability to implement these practices.
  • Expert-level fluency in PowerShell and C# in a Microsoft Windows environment.
  • Hands-on experience with cloud platforms specifically Azure and AWS.
  • Demonstrated experience in automated software testing (unit, integration, end-to-end).
  • Deep knowledge of software applications and technical processes, with emerging depth in one or more technical disciplines.
  • Proficiency and experience in observability, including white-box and black-box monitoring, SLO alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
  • Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform).
  • Experience with containerization and container orchestration (e.g., Docker, Kubernetes, ECS).
  • Ability to mentor and teach programming concepts to system engineers with non-programming backgrounds, fostering a programming mindset and best practices.
  • Excellent communication and strategic thinking skills, with the ability to collaborate across teams, regions, and stakeholder groups.

Preferred qualifications, capabilities, and skills

  • Experience leading teams or projects in a site reliability or automation-focused role.
  • Experience in financial services or other highly regulated, secure enterprise environments.
  • Experience with containerization and orchestration (e.g., Docker, Kubernetes, ECS).
  • Familiarity with complex data structures and algorithms.
  • Drive to self-educate and evaluate new technologies.
  • Ability to expand and collaborate across different levels and stakeholder groups.
  • Experience architecting self-healing or remediation automation (a plus, but not required at this stage).


Key Skills

  • Spring
  • .NET
  • C/C++
  • Go
  • React
  • OOP
  • C#
  • Data Structures
  • JavaScript
  • Software Development
  • Java
  • Distributed Systems

About Company

JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world’s most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans ov ...
