Site Reliability Engineer - Principal Epic

Quest Diagnostics

Job Location:

Secaucus, NJ - USA

Monthly Salary: $150,000 - $175,000
Posted on: 4 days ago
Vacancies: 1 Vacancy

Job Summary

Description

As a Principal Site Reliability Engineer, you will be responsible for building an SRE practice and establishing monitoring and performance engineering best practices, aligned to our agile teams, to help drive the availability, resiliency, and stability of Quest products, platforms, and services.

You are an engineering technical leader who has a passion for reliability and a wide breadth of experience. Ideally, you will have had experience as a Site Reliability and Observability Engineer, where you made significant improvements to products, services, platforms, and the customer experience. You will also partner with architecture, engineers, security, and operations to design and build reusable patterns for deploying reliable and resilient solutions.

You will also be responsible for attracting, retaining, and growing top SRE engineering talent, providing guidance and mentorship to team members.

You will bring empathy, humility, and a continuous learning mindset to every interaction. You are motivated to innovate and create, to always do the right thing, and to improve both what we build and how we build it.

Pay Range: $00 plus yearly bonus

Salary offers are based on a wide range of factors, including relevant skills, training, experience, education, and, where applicable, certifications obtained. Market and organizational factors are also considered. Successful candidates may be eligible to receive annual performance bonus compensation.

Remote: This position, which supports Epic, can be remote, within certain criteria, for candidates not located near a hub location.

This position is hybrid and will require 3 days on site at one of the following Quest sites: Secaucus, NJ or Schaumburg, IL.

Benefits Information: We are proud to offer best-in-class benefits and programs to support employees and their families in living healthy, happy lives. Our pay and benefit plans have been designed to promote employee health in all respects: physical, financial, and developmental. Depending on whether it is a part-time or full-time position, some of the benefits offered may include:

  • Day 1 Medical, supplemental health, dental & vision for FT employees who work 30 hours
  • Best-in-class well-being programs
  • Annual no-cost health assessment program
  • Blueprint for Wellness
  • healthyMINDS mental health program
  • Vacation and Health/Flex Time
  • 6 Holidays plus 1 MyDay off
  • FinFit financial coaching and services
  • 401(k) pre-tax and/or Roth IRA with company match up to 5% after 12 months of service
  • Employee stock purchase plan
  • Life and disability insurance plus buy-up option
  • Flexible Spending Accounts
  • Annual incentive plans
  • Matching gifts program
  • Education assistance through MyQuest for Education
  • Career advancement opportunities
  • and so much more!


Responsibilities
  • Experience in transforming an organization by designing and implementing SRE capabilities, including monitoring, performance, and chaos engineering. You will set the strategy for overall Site Reliability Engineering (SRE)/Development alignment.
  • Lead initiatives to implement service levels (SLIs, SLOs, SLAs) and error budgets. You will initiate, influence, and drive SRE within the organization and work with product and service teams to enable this model.
  • Provides guidelines/patterns and establishes proper metrics for building highly scalable, reliable, high-performing systems.
  • Strategizes best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and meaningful alerting.
  • Coaches and mentors teams of monitoring, performance, and SRE engineers.
  • Proven ability to implement processes, solutions, and engineering capabilities at scale.
  • Prior experience in large-scale digital technologies where uptime and continuous availability were core to the business.
  • Strong acumen in public cloud and/or private cloud implementation and application adoption.
  • Strong understanding of Cloud, API, Event-Driven, and Microservices technologies for large-scale environments.
  • Influences other leaders, principals, and engineers, opening the discussion on and adoption of SRE best practices.
  • Builds relationships with other leaders and groups across the company, providing understanding of SRE concepts and value.
  • Work with other team leads to identify improvements outside of SRE, e.g. DevOps, Quality, etc.
  • Partners with the Director of SRE to build platform roadmaps and frameworks and to identify team/process improvements.
  • Technical owner of SRE tools with expertise and understanding of current and other widely used industry tools.
  • Evaluates other tools/solutions for SRE to ensure IT is being cost aware and tool agnostic.


Qualifications

Required Work Experience:

  • 10 years of experience in developing enterprise software and proficiency in multiple languages, e.g. Java, and web technologies (Python, Go, Perl, Ruby, or shell scripting)
  • 5 years in implementing SRE solutions/practices.
  • 5 years in mentoring and coaching.
  • Expert knowledge of Dynatrace as product owner user and
  • Expert with a proven track record in delivering technology solutions and leading a high performing SRE team in automating manual work.
  • Expert knowledge of reliability and production management domains
  • Experience in public cloud environments (AWS/Azure/Google Cloud).
  • Experience in leading operations leveraging key event streaming, messaging, and DB services, e.g. Cassandra, MQ/JMS/Kafka, Aurora, RDS, Cloud SQL, Bigtable, DynamoDB, Cloud Spanner, Kinesis, Cloud Pub/Sub, etc.
  • Experience in SAFe, agile Scrum, or Kanban models
  • Expertise in DevSecOps practices and tools, e.g. CI/CD, GitLab, and any security scanning tools.
  • Experience with cloud-based technologies and tools, especially in deployment, monitoring, and operations
  • Strong experience and technical skills in developing/managing APIs and Microservices
  • Expert practitioner in multiple technology domains; may be a cross-domain expert able to solve complex and mission-critical problems within a business or across the firm

Preferred Work Experience:

  • Experience with containerization (Docker, Kubernetes)
  • Experience with Terraform and Ansible
  • Experience with SIEM
  • Experience with other APM tools
  • Healthcare industry experience

Physical and Mental Requirements:

  • Ability to sit for long periods of time

Knowledge:

  • Compliance requirements, e.g. NIST, CFR21, ISO, GDPR, HIPAA, SOX
  • HL7 specifications
  • Integration Platform technologies (MuleSoft, Informatica, SnapLogic, Jitterbit, etc.)

Skills:

  • Self-driven
  • Problem solving
  • Adaptable
  • Negotiation
  • Prioritization

Education

  • Bachelor's degree in computer engineering or a similar field, or equivalent work experience (Required)
  • Master's degree in computer engineering (Preferred)

Languages

  • English (Preferred)

Licenses and Certifications

  • AWS (Preferred)
  • Azure (Preferred)
  • GCP (Preferred)

Work Requirements

  • Travel required: up to 30%



Required Experience:

Staff IC

About Company

Quest Diagnostics (NYSE: DGX) empowers people to take action to improve health outcomes. Derived from the world's largest database of clinical lab results, our diagnostic insights reveal new avenues to identify and treat disease, inspire healthy behaviors and improve health care mana ...