Site Reliability Engineer, Core Streaming (Remote United States)
San Francisco, CA - USA
Job Summary
Do you want to help build and operate scalable, resilient systems that power Yelp's critical business functions? Our Site Reliability Engineers (SREs) ensure our services remain fast, reliable, and available, even as we grow and requirements evolve. As an SRE specializing in Kafka, you'll play a pivotal role in managing our real-time data streaming infrastructure and supporting event-driven applications at scale.
We work at the intersection of software development and distributed systems, owning the backbone of our organization's streaming architecture. As a Kafka SRE, you'll take on challenges only found at the kind of scale that supports global, always-on applications. Yelp processes massive amounts of user data daily: over 300 million business reviews, 100,000 photo uploads, and countless check-ins. Maintaining sub-minute data freshness at that volume is an exciting technical problem and an interesting area to work in.
You'll drive best practices in automation and self-service, knowing that deploying or upgrading data streaming infrastructure should be as effortless as a git commit and a code review.
This opportunity is fully remote and does not require you to be located in any particular state within the US. We welcome applicants from throughout the US. We'd love to have you apply, even if you don't feel you meet every single requirement in this posting. At Yelp, we're looking for great people, not just those who simply check off all the boxes.
What you'll do:
- Design, deploy, and maintain large-scale Kafka event streaming infrastructure across hybrid and multi-cloud environments.
- Collaborate with engineers to enable new features, ensure data pipeline reliability, and advise on best practices for real-time data processing.
- Execute and automate Kafka cluster upgrades, migrations, and major version rollouts with minimal impact to critical services.
- Build or enhance self-service capabilities and automation for cluster operations, scaling, and incident recovery.
- Troubleshoot complex issues affecting data flow, performance, or stability, and drive root cause analyses.
- Participate in on-call rotations. Our geographically distributed SRE teams use a follow-the-sun model, so no one needs to be on-call 24 hours a day!
What it takes to succeed:
- Strong hands-on experience designing and implementing large-scale Kafka event streaming capabilities in production across hybrid or multi-cloud and Linux environments, including upgrades and migrations between platforms or versions.
- In-depth knowledge of event streaming/data-in-motion design principles, architecture, and operational nuances.
- Programming proficiency in Java, Python, or similar modern languages for tooling, integration, and automation.
- Familiarity with Kafka Client APIs (Producer, Consumer, Streams), as well as sizing and capacity planning for high-throughput clusters.
- Experience designing and optimizing real-time data streaming solutions with technologies like Apache Flink.
- Knowledge of automating infrastructure and operational tasks (configuration management, IaC, scripting, or related).
- Problem-solving mindset with an eagerness to learn, take initiative, and advocate for infrastructure best practices in a fast-paced environment.
- A Bachelor's degree or equivalent work experience is required.
What you'll get:
There are a variety of factors that go into determining a compensation range, including but not limited to external market benchmark data and years of experience. Based on the anticipated level of experience that we are seeking, we expect the compensation range for this role to be between $141,000 and $216,000. The actual compensation offered may be influenced by a variety of factors, including the candidate's experience and skill set.
There may be flexibility with the range included in this posting should a candidate be leveled higher or lower than the posted range.
- This opportunity has the option to be fully remote in all locations across the US.
- You can find more information about Yelp's five-star benefits here!
Required Experience:
IC
About Company
Be part of an empowering mission to connect people to great local businesses with Yelp! Search for open positions on our career site today.