At Amazon Selection and Catalog Systems (ASCS), our mission is to power the online buying experience for customers worldwide so they can find, discover, and buy any product they want. We innovate on behalf of our customers to ensure uniqueness and consistency of product identity and to infer relationships between products in Amazon's Catalog to drive the selection gateway for the search and browse experiences on the website. We're solving a fundamental AI challenge: establishing product identity and relationships. Using Generative AI, Visual Language Models (VLMs), and multimodal reasoning, we determine what makes each product unique and how products relate to one another across Amazon's catalog. The scale is staggering: billions of products, petabytes of multimodal data, millions of sellers, dozens of languages, and infinite product diversity, from electronics to groceries to digital content.
The PRISM team operates at the frontier of ML engineering. We build the serving infrastructure and ML platforms that bring large-scale GenAI (LLMs, VLMs, multimodal foundation models) from research to production across Amazon's catalog. You'll work with the latest techniques in optimized model serving, distillation, quantization, distributed inference, querying billion-scale vector indices, and agentic systems that automate data curation, training, and evaluation end-to-end. Every system you build accelerates how fast we can experiment and how efficiently we can serve frontier models to hundreds of millions of customers daily.
We are looking for a Software Development Engineer at the intersection of GenAI, ML platforms, and high-scale distributed systems. You will tackle some of the hardest problems in ML engineering: optimizing LLM/VLM serving for latency and cost at massive scale, designing agentic systems that autonomously reason over complex product data, and building the automated pipelines that continuously integrate, test, and deploy models into production. You will work alongside applied scientists; your systems will serve hundreds of millions of customers daily, and your engineering decisions will directly determine how fast we can innovate.
Key job responsibilities
* Build and optimize GenAI serving systems at massive scale: cascaded inference with intelligent model routing, optimized LLM/VLM serving pipelines, and inference optimization techniques that achieve order-of-magnitude cost reductions while processing millions of daily submissions across billions of products
* Build ML platforms and agentic systems that power the full experiment-to-production lifecycle: automated training pipelines, intelligent data curation, continuous model improvement, evaluation frameworks, and CI/CD for all model workflows, dramatically accelerating how fast research ideas become production systems
* Architect reliable distributed systems from scratch within Amazon's ecosystem: high availability, low latency, and operational excellence across hundreds of millions of daily transactions
* Partner with applied scientists to productionize research, bridging the gap between experimental models and robust, maintainable production infrastructure
* Generate intellectual property through patents and publications, contributing novel systems designs, serving optimization techniques, and agentic architectures to the broader ML engineering community
* Drive engineering excellence: rigorous code reviews, scalable design, comprehensive testing, and proactive operational ownership
* Mentor junior engineers on ML infrastructure, distributed systems, and operational best practices, raising the technical bar across the team
- 3 years of non-internship professional software development experience
- 2 years of non-internship design or architecture (design patterns, reliability, and scaling) of new and existing systems experience
- Experience programming with at least one software programming language
- 3 years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent
- Experience building complex software systems that have been successfully delivered to customers, or experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution
- Experience with vLLM, SGLang, TensorRT, or similar platforms in production environments, or experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution
- Experience with large-scale data systems, vector databases, and approximate nearest neighbor search
- Experience building CI/CD pipelines, workflow orchestration, and automation frameworks for ML workflows
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit
for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at
WA, Seattle: 143,700.00 - 194,400.00 USD annually