AI Software Engineer
Seattle, WA - USA
Job Summary
The Team
You will join a dynamic AI Infrastructure team focused on enabling high-performance AI across Zoom's products and services. The team builds the core systems that support model training, deployment, and inference at scale, driving innovation in areas such as real-time communication, computer vision, and natural language understanding.
What You Can Expect
You'll design, implement, and own the inference systems that serve Zoom's AI models at production scale across real-time communication, vision, and language workloads. You'll be hands-on with kernel-level optimisation, inference framework internals, and production serving infrastructure, working closely with research and platform teams to push the boundary on latency, throughput, and cost.
Responsibilities
Design and build high-performance inference serving systems for large-scale transformer and multimodal models (including 100B and MoE architectures)
Implement and tune inference optimisations: speculative decoding, continuous batching, KV cache management, prefill/decode disaggregation, and quantisation (INT4/INT8/FP8)
Contribute to and customise inference frameworks (vLLM, TensorRT-LLM, SGLang, or equivalent) for Zoom's production requirements
Write and profile CUDA kernels and custom ops where framework-level optimisation is insufficient
Own end-to-end deployment: from model packaging and serving API design to latency SLO monitoring and incident response
Partner with research to translate model architecture changes into inference-efficient implementations
Drive technical design and set the bar for inference engineering practices across the team
What We're Looking For
5 years of software engineering experience, with significant time spent on inference systems or ML infrastructure at production depth
Hands-on experience with at least one major inference framework: vLLM, TensorRT-LLM, SGLang, or ONNX Runtime (serving, not just export)
GPU programming experience: CUDA kernel development, memory optimisation, profiling with Nsight or equivalent
Production experience serving LLMs or large vision models: you've owned latency SLOs, debugged throughput regressions, and shipped optimisations that moved the needle
Depth in at least two of: speculative decoding, continuous batching, KV cache design, quantisation pipelines, prefill/decode disaggregation
Strong systems instincts in Python and C; ability to read and modify framework internals
Preferred:
Experience with MoE models or 100B parameter deployments
Familiarity with disaggregated serving architectures or multi-node inference
Background in compiler-level optimisation (XLA, Triton, or similar)
Salary Range or On Target Earnings:
Minimum: $151,800.00
Maximum: $332,200.00
In addition to the base salary and/or OTE listed, Zoom has a Total Direct Compensation philosophy that takes into consideration base salary, bonus, and equity value.
Note: Starting pay will be based on a number of factors and commensurate with qualifications & experience.
We also have a location-based compensation structure; there may be a different range for candidates in this and other locations.
Ways of Working
Our structured hybrid approach is centered around our offices and remote work environments. The work style of each role (Hybrid, Remote, or In-Person) is indicated in the job description/posting.
Benefits
As part of our award-winning workplace culture and commitment to delivering happiness, our benefits program offers a variety of perks, benefits, and options to help employees maintain their physical, mental, emotional, and financial health; support work-life balance; and contribute to their community in meaningful ways.
About Us
Zoomies help people stay connected so they can get more done together. We set out to build the best collaboration platform for the enterprise, and today help people communicate better with products like Zoom Contact Center, Zoom Phone, Zoom Events, Zoom Apps, Zoom Rooms, and Zoom Webinars.
We're problem-solvers working at a fast pace to design solutions with our customers and users in mind. Find room to grow with opportunities to stretch your skills and advance your career in a collaborative, growth-focused environment.
Our Commitment
At Zoom, we believe great work happens when people feel supported and empowered. We're committed to fair hiring practices that ensure every candidate is evaluated based on skills, experience, and potential. If you require an accommodation during the hiring process, let us know; we're here to support you at every step.
We welcome people of different backgrounds, experiences, abilities, and perspectives, including qualified applicants with arrest and conviction records and any qualified applicants requiring reasonable accommodations in accordance with the law.
If you need assistance navigating the interview process due to a medical disability, please submit an Accommodations Request Form and someone from our team will reach out soon. This form is solely for applicants who require an accommodation due to a qualifying medical disability. Non-accommodation-related requests, such as application follow-ups or technical issues, will not be addressed.
Think of this opportunity as a marathon, not a sprint! We're building a strong team at Zoom, and we're looking for talented individuals to join us for the long haul. No need to rush your application; take your time to ensure it's a good fit for your career goals. We continuously review applications, so submit yours whenever you're ready to take the next step.
Required Experience:
IC
About Company
Zoom unifies cloud video conferencing, simple online meetings, and group messaging into one easy-to-use platform. Zoom is growing at an explosive pace by every measure – revenues, people, innovation, and customers. Led by Eric S. Yuan, the #1 ranked CEO on Glassdoor, our unique cultur ...