ML Platform @ Roblox today supports hundreds of ML use cases and billions of inferences per day across Discovery, Safety, Engine, and much more. As a machine learning engineer on ML Platform, you will be responsible for digging deep into model internals to optimize performance for both training and inference. We are looking for accomplished engineers to help us maximize the performance of our platform.
You Will:
- Optimize machine learning models for performance on GPU architectures, focusing on both training and inference workflows.
- Conduct low-level performance profiling and analysis to identify bottlenecks in existing machine learning pipelines and propose actionable improvements.
- Contribute to the development of best practices and tooling for model optimization and deployment.
- Collaborate with cross-functional teams, including data scientists and software engineers, to integrate and deploy optimized models into production environments.
- Partner across organizations to build tooling, interfaces, and visualizations that make the platform a delight to use.
You Have:
- 4 years of professional experience and a tool chest of system design experience upon which to draw to build performant systems for all of Roblox.
- Significant experience debugging GPUs: reading GPU profiles, debugging Xid errors, etc.
- Ideally, experience with model optimization techniques for LLMs, such as speculative decoding, continuous batching, quantization, etc.
- Bachelor's degree in Computer Science, Computer Engineering, Data Science, or a similar technical field.
You Are:
- Proficient in advanced tools and frameworks (e.g., CUDA, Triton, TensorRT) to enhance model speed and reduce latency.
- A performance nut: you love pushing the limits of what's possible, whether it's squeezing every last ounce of efficiency from a GPU, fine-tuning algorithms for peak speed, or innovating new techniques to enhance model performance.
- A generalization advocate: you're passionate about building tools and frameworks that consistently deliver improvements in model performance.
- Passionate about supporting internal partners (data scientists and ML Engineers) to meet and understand their needs.
Required Experience:
Staff IC