Sr. Staff Site Reliability Engineer

Coupang

Job Location:

Seattle, WA - USA

Monthly Salary: Not Disclosed
Posted on: 15 days ago
Vacancies: 1 Vacancy

Job Summary

Job Overview:

Site Reliability Engineer (SRE) at Coupang is a mission-critical role that combines software and systems engineering to build, run, and scale our complex, large-scale e-commerce systems. As part of the Site Reliability Engineering team, you will be responsible for ensuring that all our customer-facing services are healthy, monitored, automated, and designed to scale. As an SRE organization, we take pride in handling operations as an engineering problem with an automation-first approach. You will use your background to build best-in-class infrastructure automation for areas such as observability, incident management, disaster recovery, load testing, capacity engineering, and many more. In this role, you will work very closely with our product development teams, from the early stages of design all the way to helping resolve production incidents, maintaining the SLI/SLA bar for production services, and influencing them with SRE principles and best practices. If you take pride in complete ownership, have a passion for solving complex technical challenges in large-scale distributed systems, and have the demeanor to work and communicate effectively across team boundaries, this is the role for you!

Key Responsibilities:

  • Serve as a primary point of responsibility for the platform reliability, health, and performance of all Coupang customer-facing services.
  • Gain deep knowledge of Coupang application workflows and dependencies.
  • Define and track key performance indicators (KPIs) and service-level objectives (SLOs) related to system availability, performance, and reliability.
  • Build a world-class incident management process and automation, including fast incident remediation, incident and operational reviews, and retrospectives.
  • Develop and implement best practices for creating, scaling, and maintaining effective monitoring, alerting, and telemetry systems.
  • Build automation to execute regular disaster recovery testing, chaos testing, and load testing to stay ahead of the expected growth of Coupang services.
  • Work closely with product development teams to ensure the products are designed with scale and operability in mind.
  • Build the right guardrails and automation for deploying production changes while holding the reliability bar.
  • Participate in a 24x7 rotation for production issue escalations, and function well in a fast-paced environment.
  • Communicate effectively with people at all levels of the organization.

Basic Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related technical field.
  • 8 years of industry experience building and operating large-scale distributed systems.

Preferred Qualifications:

  • Prior experience working with AI/ML, large-scale web-based Java architectures, and JVM configuration.
  • Professional certifications in cloud platforms, monitoring tools, or related technologies.
  • Previous experience working on large-scale GPU/cloud infrastructure platforms.
  • SLO/SLA management and implementation experience.
  • Deep UNIX/Linux systems knowledge and administration background.
  • Demonstrated programming skills in one or more of: Python, Java, Golang, Ruby.
  • Strong problem-solving and analytical skills spanning systems, network (TCP/IP), and code, with a focus on data-driven decision-making.
  • Experience with cloud-based GPU infrastructure, including AWS, Azure, or Google Cloud Platform.
  • Strong understanding of DevOps and SRE practices, including continuous integration, continuous delivery, and infrastructure as code (IaC).
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes.
  • Excellent communication and collaboration skills, with the ability to work with teams across distinct functions and technical domains.
  • Knowledge of the OpenTelemetry observability ecosystem, including metrics, logging, tracing, and tools such as Prometheus, Grafana, Elastic Stack, Datadog, or New Relic.

Pay & Benefits

Our compensation reflects the cost of labor across several US geographic markets. At Coupang, your base pay is one part of your total compensation.

The base pay for this position ranges from $176,000/year in our lowest geographic market to $221,000/year in our highest geographic market. Pay is based on several factors, including market location, and may vary depending on job-related knowledge, skills, and experience.

General Description of All Benefits

  • Medical/Dental/Vision/Life, AD&D insurance
  • Flexible Spending Accounts (FSA) & Health Savings Account (HSA)
  • Long-term/Short-term Disability
  • Employee Assistance Program (EAP)
  • 401K Plan with Company Match
  • 18-21 days of Paid Time Off (PTO) a year, based on tenure
  • 12 Public Holidays
  • Paid Parental leave
  • Pre-tax commuter benefits
  • MTV - Free Electric Car Charging Station

General Description of Other Compensation

Other compensation includes, but is not limited to, bonuses, equity, or other forms of compensation that would be offered to the hired applicant in addition to their established salary range or wage scale.

Coupang is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to actual or perceived race (including traits historically associated with race, including but not limited to hair texture and protective hairstyles), color, religion, religious creed (including religious dress and grooming practices), sex or gender (including pregnancy, childbirth, breastfeeding, and medical conditions related to pregnancy, childbirth, or breastfeeding), gender identity, gender expression, sexual orientation, ancestry, national origin (including language use restrictions), age (40 and over), physical or mental disability, medical condition, genetic information, HIV/AIDS or Hepatitis C status, family status (including but not limited to marital or domestic partnership status), military or veteran status, use of a trained dog guide or service animal, political activities or affiliations, citizenship, family and medical leave status, status as a victim of any violent crime, or any other characteristic or class protected by the laws or regulations in the locations where we operate. Coupang is also committed to providing a safe work environment for its employees and its consumers. If you need assistance and/or a reasonable accommodation in the application or recruiting process due to a disability, please contact us at

Requisition: R0065794

Equal Opportunities for All

Coupang is an equal opportunity employer. Our unprecedented success would not be possible without the valuable input of our globally diverse team.


Required Experience:

Staff IC

Key Skills

  • Kubernetes
  • FMEA
  • Continuous Improvement
  • Elasticsearch
  • Go
  • Root cause Analysis
  • Maximo
  • CMMS
  • Maintenance
  • Mechanical Engineering
  • Manufacturing
  • Troubleshooting

About Company

Join us to innovate. Rocket your career. Collaborate with teams across the globe. Find your role and learn more about our culture.