Staff Software Engineer, AI Traffic & Inference Infrastructure
Job Summary
Company Introduction
We exist to wow our customers. We know we're doing the right thing when we hear our customers say, "How did we ever live without Coupang?" Born out of an obsession to make shopping, eating, and living easier than ever, we're collectively disrupting the multi-billion-dollar e-commerce industry from the ground up. We are one of the fastest-growing e-commerce companies and have established an unparalleled reputation for being a dominant and reliable force in South Korean commerce.
We are proud to have the best of both worlds: a startup culture with the resources of a large global public company. This fuels us to continue our growth and launch new services at the speed we have maintained since our inception. We are all entrepreneurs surrounded by opportunities to drive new initiatives and innovations. At our core, we are bold and ambitious people who like to get our hands dirty and make a hands-on impact. At Coupang, you will see yourself, your colleagues, your team, and the company grow every day.
Our mission to build the future of commerce is real. We push the boundaries of what's possible to solve problems and break traditional tradeoffs. Join Coupang now to create an epic experience in this always-on, high-tech, and hyper-connected world.
Role Overview
As a Staff Engineer on our Coupang Intelligent Cloud (CIC) Infrastructure team, you will design and scale the intelligent nervous system of our CIC Cloud AI platform. You won't just be moving packets; you'll be building the orchestration and routing layers that ensure our LLMs and foundation models are highly available, low-latency, and cost-efficient. You will own the end-to-end lifecycle of traffic management, from global load balancing to hardware-aware request routing across thousands of accelerators.
What You Will Do
Intelligent Routing: Design and implement sophisticated load-balancing algorithms tailored for AI workloads (training, inference), optimizing request distribution based on model availability and accelerator health.
Inference Orchestration: Architect and evolve our inference infrastructure to support seamless model deployment, auto-scaling, and multi-AZ failover.
Performance Engineering: Drive initiatives to minimize tail latency (P95/P99) and maximize throughput using advanced batching, caching, and streaming token delivery techniques.
Fleet Automation: Build robust infrastructure-as-code and CI/CD pipelines to manage dynamic compute fleets, ensuring they automatically scale to meet production and research demands.
Observability & Optimization: Leverage deep telemetry data to tune system performance and hardware-agnostic scheduling across diverse GPU/TPU environments.
Technical Leadership: Lead cross-functional initiatives across infrastructure, software, and ML teams, providing mentorship and setting the long-term technical roadmap for traffic management.
Basic Qualifications
Education: Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
Experience: 8-12 years of progressive software engineering experience, with a heavy emphasis on distributed systems, cloud-native architectures, or platform operations.
Programming: Strong proficiency in Go or Python, with a deep understanding of networked systems and performance optimization.
Orchestration: Expert-level knowledge of Kubernetes internals (scheduling, controllers) and containerization ecosystems.
Traffic Management: Proven experience with load balancing, service mesh, and request routing at scale.
Operational Excellence: A strong ownership mindset with a track record of maintaining mission-critical, high-availability systems in production.
Preferred Qualifications
AI/ML Domain Knowledge: Prior experience building infrastructure specifically for LLM inference or large-scale training clusters.
Low-Level Optimization: Familiarity with inference optimization, including mixed precision, kernel tuning, or custom hardware accelerators.
Public/Private Cloud: Experience managing hybrid-cloud or multi-AZ deployments across AWS, Azure, or GCP.
Compliance: Experience operating in regulated environments with strict security and compliance requirements.
Type of work model
- Hybrid
Details to consider
- Those eligible for employment protection (recipients of veterans' benefits, the disabled, etc.) may receive preferential treatment for employment in accordance with applicable laws.
Privacy Notice
- Your personal information will be collected and managed by Coupang as stated in the Application Privacy Notice located below.
Experience: Staff IC
About Company
Join us to innovate. Rocket your career. Collaborate with teams across the globe. Find your role and learn more about our culture.