Lead Site Reliability Engineer

Job Location

Columbus - USA

Monthly Salary

Not Disclosed

Vacancy

1 Vacancy

Job Description

Assume a critical role in defining the future of a globally recognized firm and have a direct and significant effect in a realm tailored for top achievers in site reliability.

As a Lead Site Reliability Engineer at JPMorgan Chase within Enterprise Technology Infrastructure Platforms, you will hold a pivotal role in your team. You will apply your extensive technical knowledge to overcome both technical and business challenges. Your duties will encompass leading resiliency design reviews, simplifying complex issues into manageable tasks for other engineers, acting as a technical lead for medium- to large-sized projects, and providing guidance and mentorship to your team members.

Job responsibilities

  • Champion a culture of site reliability, exerting technical influence throughout your team and the organization.
  • Lead initiatives to improve service levels using data-driven analytics, enhancing the reliability and stability of applications and platforms.
  • Collaborate with team members to identify comprehensive service level indicators and work with stakeholders to establish service level objectives and error budgets.
  • Demonstrate high-level expertise in AWS, distributed systems, and data warehouse domains, proactively resolving technology-related bottlenecks.
  • Act as the primary point of contact during major incidents, showcasing the ability to quickly identify and resolve issues to prevent financial losses.
  • Document and share knowledge within the organization through internal forums and communities of practice.

Required Qualifications, Capabilities, and Skills:

  • Formal training or certification in site reliability engineering concepts with 5 years of applied experience.
  • Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices.
  • Strong infrastructure experience, including designing, implementing, and maintaining scalable and resilient systems.
  • Proficiency in a programming language such as Python.
  • Extensive knowledge of software applications and technical processes with emerging expertise in one or more technical disciplines.
  • Proficiency in observability, including white- and black-box monitoring, SLO alerting, and telemetry collection using tools like Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc.
  • Experience with continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform).
  • Experience with cloud computing using AWS (EC2, EMR, Athena, Glue, Redshift, etc.) and container orchestration (e.g., ECS, Kubernetes, Docker).
  • Experience troubleshooting common networking technologies and issues.

Preferred Qualifications, Capabilities, and Skills:

  • Ability to identify and solve problems related to complex data structures and algorithms.
  • Self-motivated and a lifelong learner, eager to embrace and master emerging technologies.
  • Ability to expand and collaborate across different levels and stakeholder groups.

Employment Type

Full-Time
