EC2 Nitro drives the planet's largest, fastest-growing, and most feature-rich compute cloud. Nitro is AWS's ground-up design for virtualization at global scale, built on a fully custom stack of hardware, firmware, and applications. Nitro has enabled EC2 to support Intel, AMD, and Amazon's custom silicon (the Graviton processor family) while raising the industry bar for security and performance across our product line.
We integrate hardware, firmware, application software, and services to deliver new virtualized and bare-metal compute platforms for companies from startups through the Fortune 500. We are looking for an experienced leader to drive software development and scaling for new EC2 compute. In this role, you will work with a broad and deep group of technical teams that develop hardware, firmware, systems, and application software.
The ideal candidate is expected to have a solid understanding of computer science fundamentals and expertise in C, C++, or Rust development in a Linux environment. Experience with Linux package management, version control systems, automated build processes, and software unit testing are [...]. In-depth knowledge of ML frameworks and cluster management is highly preferred.
Key job responsibilities
- Design and develop innovative technologies that power the infrastructure supporting machine learning workloads
- Lead technical projects establishing EC2 as the definitive source for ML performance best practices across diverse applications, including LLMs, multimodal systems, and emerging model architectures
- Develop and maintain comprehensive regression testing systems that validate performance across major component releases, including frameworks, firmware, drivers, and networking infrastructure
- Collaborate with hardware engineering teams to influence future platform designs based on performance insights gathered from state-of-the-art research and customer workloads
- Build customer relationships by investigating complex performance challenges, developing solutions, and publishing actionable best practices through multiple channels
About the team
The EC2 Nitro Machine Learning Systems team is responsible for the development, operations, and maintenance of scale-out machine learning platforms used for training and inference workloads. We build and optimize the infrastructure that powers some of the most computationally intensive AI/ML workloads in the cloud. Our team is passionate about creating reliable, high-performance systems that enable customers to push the boundaries of what's possible with machine learning.
Working with us means having the opportunity to influence the future of supercomputing in the cloud while solving complex technical challenges at massive scale. We collaborate closely with customers and internal teams to continuously improve our platforms and deliver innovations that accelerate machine learning workflows.
- 5 years of non-internship professional software development experience
- 5 years of experience programming with at least one software programming language
- 5 years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience as a mentor, tech lead, or leader of an engineering team
- 5 years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit
for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at
WA, Seattle - 168,100.00 - 227,400.00 USD annually