At eBay, we're more than a global ecommerce leader. We're changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We're committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts.
Our customers are our compass, authenticity thrives, bold ideas are welcome, and everyone can bring their unique selves to work every day. We're in this together, sustaining the future of our customers, our company, and our planet.
Join a team of passionate thinkers, innovators, and dreamers, and help us connect people and build communities to create economic opportunity for all.
About the team and the role:
eBay's AI Platform team is building the next generation of agentic and inference technologies that power AI experiences for hundreds of millions of users worldwide. We are seeking an ML Inference Router Engineer to design and build a highly scalable, low-latency inference gateway capable of supporting billions of daily requests.
This role sits at the core of eBay's AI infrastructure, developing distributed, fault-tolerant systems that orchestrate requests across diverse large language models (LLMs) and ensure high reliability, efficiency, and cost-effectiveness. If you are passionate about large-scale systems engineering, love solving hard performance problems, and want to shape the backbone of AI at global scale, we'd love to hear from you.
What you will accomplish:
Design and build an LLM inference gateway that scales to billions of daily requests with millisecond-level latency.
Develop intelligent request routing, load balancing, and fallback mechanisms across heterogeneous LLM backends (internal and external).
Optimize throughput, cost, and reliability of inference workloads in multi-tenant environments.
Collaborate with platform, research, and product teams to integrate new models and agentic capabilities into the gateway.
Implement observability, tracing, and autoscaling for inference traffic across Kubernetes-based clusters.
Conduct design and code reviews to ensure high standards in distributed systems architecture.
Stay current with advances in LLM serving, inference acceleration, and model APIs to continuously evolve the platform.
What you will bring:
10 years of experience building large-scale, fault-tolerant, high-performance distributed systems.
Strong programming skills in one or more of Java, Go, Rust, or C (Java preferred for gateway services).
Deep understanding of networking, concurrency, memory management, and performance tuning in production systems.
Proven experience designing and operating low-latency APIs at very large scale (10M QPS).
Hands-on experience with Kubernetes, service meshes, and container orchestration at scale.
Strong background in cloud infrastructure (AWS, GCP, Azure) and distributed system design.
Bonus Skills:
Experience with inference serving frameworks (vLLM, Triton, TensorRT-LLM, FasterTransformer, DeepSpeed-MII, or similar).
Familiarity with LLM tokenization, batching, and scheduling strategies.
Background in microservice API gateway design (rate limiting, routing policies, authentication).
Experience with real-time monitoring, tracing, and autoscaling of high-throughput systems.
Contributions to open-source distributed systems or ML serving projects.
#LI-Hybrid
The base pay range for this position is expected in the range below:
$132,000 - $222,100
Base pay offered may vary depending on multiple individualized factors, including location, skills, and experience. The total compensation package for this position may also include other elements, including a target bonus and restricted stock units (as applicable), in addition to a full range of medical, financial, and/or other benefits (including 401(k) eligibility and various paid time off benefits, such as PTO and parental leave). Details of participation in these benefit plans will be provided if an employee receives an offer of employment.
If hired, employees will be in an at-will position, and the Company reserves the right to modify base salary (as well as any other discretionary payment or compensation program) at any time, including for reasons related to individual performance, Company or individual department/team performance, and market factors.
Please see the Talent Privacy Notice for information regarding how eBay handles your personal data collected when you use the eBay Careers website or apply for a job with eBay.
eBay is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, sex, sexual orientation, gender identity, veteran status, and disability, or other legally protected status. If you have a need that requires accommodation, please contact us at. We will make every effort to respond to your request for accommodation as soon as possible. View our accessibility statement to learn more about eBay's commitment to ensuring digital accessibility for people with disabilities. It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.
Founded in 1995 in San Jose, Calif., eBay (NASDAQ: EBAY) is where the world goes to shop, sell and give. Whether you’re buying new or used, common or luxurious, trendy or rare – if it exists in the world, it’s probably for sale on eBay. Our great value and unique selection help every ...