Sr. Software Engineer (with Experience in SRE)

Job Location

Herzliya - Israel

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Guide and shape the future of technology at a globally recognized firm driven by pride in ownership.

As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Digital & Platform Services team, you are the non-functional requirement owner and champion for the applications in your remit. You are a key influencer in your team's strategic planning, driving continual improvement in the customer experience, resiliency, security, scalability, monitoring, instrumentation, and automation of the software in your area. You act in a blameless, data-driven manner and navigate difficult situations with composure and tact. You create and drive SRE culture, mindsets, and behavior by implementing industry best practices, providing SRE teams with consulting and coaching, and offering a data-driven view of how SRE teams are performing and where they can improve.

Job responsibilities

  • Demonstrates expertise in site reliability principles and an understanding of the fine balance between features, efficiency, and stability
  • Act as the main point of contact during major incidents, demonstrating technical expertise to quickly identify and solve issues while documenting and sharing knowledge within the organization. Evolve and debug critical components of applications and platforms.
  • Effectively manage incidents and strive to improve Mean Time to Recovery (MTTR) and other MTTx metrics through proactive monitoring and response strategies.
  • Implement and maintain observability tools, including Dynatrace, Grafana, Splunk, CloudWatch, and Datadog, to monitor and ensure system performance and reliability. Visualize these metrics and set up dashboards for real-time monitoring.
  • Deploy and oversee services within cloud environments, with a strong preference for AWS, ensuring optimal performance and cost-efficiency
  • Drive continuous improvement of reliability, monitoring, and alerting for mission-critical microservices while reducing toil through automation and creating reliable infrastructure and tooling to expedite feature development.
  • Develop and implement metrics for microservices; define user journeys, SLOs, and error budgets; configure dashboards and alerts; and facilitate blameless postmortems to ensure permanent incident closure.
  • Engage with development teams throughout the software lifecycle to enhance reliability and scale, design self-healing and resiliency patterns, and implement infrastructure, configuration, and network as code.
  • Collaborate with software engineers and teams to design and implement deployment approaches using automated CI/CD pipelines, supporting the adoption of site reliability engineering best practices.
  • Implement and regularly test DR strategies to ensure the highest level of resilience and fault tolerance of the platform.
  • Maintain and promote high-quality written documentation of assets, processes, and runbooks that the team uses in its day-to-day operations.
  • Effectively negotiates with peers and executive partners to ensure optimal outcomes for all, and drives the adoption of site reliability practices throughout the organization
  • Ensures your teams demonstrate site reliability best practices, with the ability to demonstrate this empirically through stability and reliability metrics
  • Drives a culture of continual improvement and solicits real-time feedback to improve the dev and client experience
  • Ensures your team collaborates with other teams within your group's specialization and avoids duplication of work where possible
  • Follows blameless, data-driven postmortem strategies and conducts regular team debriefs to enable learning from both successes and mistakes
  • Provides personalized coaching for entry- to mid-level team members
  • Ensures your team documents and shares their knowledge and innovations via internal forums, communities of practice, guilds, and conferences

Required qualifications, capabilities, and skills

  • Formal training or certification on software engineering concepts and 15 years applied experience.
  • Advanced knowledge in site reliability culture and principles with demonstrated ability to implement site reliability within an application or platform
  • Proficiency in at least one programming language, such as Python, Go, or Java/Spring Boot, with expertise in designing, coding, testing, and delivering software.
  • Proven experience deploying services to cloud platforms, preferably AWS. Understanding of and working experience with AWS applications, and an understanding of resiliency, scalability, observability, monitoring, etc.
  • Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform)
  • Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker)
  • Proficiency and experience in observability, such as white-box and black-box monitoring and SLO alerting, and comprehensive experience with observability tools, specifically Dynatrace, Grafana, Splunk, CloudWatch, and Datadog.
  • Experience in incident management and improving MTTx metrics.
  • Hands-on experience with relational databases like Oracle or MySQL
  • Strong problem-solving skills with a high level of accountability.
  • Excellent written and verbal communication skills.
  • Ability to work independently and manage ambiguous scopes effectively.
  • Experience with troubleshooting common networking technologies and issues
  • Ability to identify and solve problems related to complex data structures and algorithms
  • Ability to expand and collaborate across different levels and stakeholder groups

Preferred qualifications, capabilities, and skills

  • Experience in banking / financial domain is preferred.
  • Experience in infrastructure architecture designs.
  • Familiarity with GCP, AWS, ECS, EKS, and Terraform
  • Expertise in networking concepts (TCP/IP, DNS, TLS, HTTPS) and CDN technologies

Required Experience:

Senior IC

Employment Type

Full-Time
