Want to help make the next generation of Machine Learning in the cloud possible? Do you have a laser focus on performance in your code? We want to talk to you!
We own the user-space software that makes the Elastic Fabric Adapter (EFA) network card work for Machine Learning (ML) and High-Performance Computing (HPC) customers on AWS. Across multiple projects written in C, our team enables customers to network thousands of GPU and CPU instance types to handle the toughest clustered workloads. Be a part of a dynamic, fast-paced group that has a big impact every day on the hottest companies doing AI and HPC today.
Key job responsibilities
You will write the highest-performing code in C for multiple open source projects supporting EFA, such as Libfabric and Open MPI. You will work with multiple teams in the stack to invent new APIs for the latest concepts in networking in the cloud. Dive deep into how your customers are doing collectives and messaging at high bandwidth and low latency. Provide expert-level support to some of the biggest names in AI in the world.
A day in the life
Start from the needs of your customer and invent new ways of cutting the occupancy of the software stack for EFA. Get your peers and stakeholders on board with excellent written designs. Write comprehensive tests to drive the development of new features and guard against regressions. Work with our ML Infrastructure team to see your products perform on 100s and 1000s of top-end machine clusters.
About the team
We are a fast-paced team that owns the user-space software stack for EFA. As part of Annapurna Labs in AWS, we are very nimble, paying careful attention to what the AI industry is going to try next and having our products ready. We focus heavily on automation, confining operations to the most interesting problems as customers continuously experiment with what our network can do. Our team is a place of growth, concentrating on your career and goals and motivating you to achieve your highest potential.
- 3 years of non-internship professional software development experience
- 2 years of non-internship design or architecture (design patterns, reliability, and scaling) of new and existing systems experience
- 3 years of professional experience programming high-performance software in C, ideally as part of an open source project
- 3 years of full software development life cycle experience, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
- Experience developing in a network software stack with a focus on cutting occupancy to the barest minimum number of instructions
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.