We work on developing, prototyping, and productizing state-of-the-art algorithms for neural network model compression. Our algorithms are implemented in PyTorch, with optimizations geared towards efficient deployment via Core ML. We optimize models across domains, including NLP, vision, and text- and image-generative models. Our APIs are available to Core ML users both internal to Apple and to external developers via the Core ML Tools optimization sub-module.

Responsibilities:
- Implement the latest model compression algorithms from research papers in the optimization library. Apply them to models critical for deployment and test on various architectures, such as diffusion models and large language models.
- Set up and debug training jobs, datasets, evaluation, and performance-benchmarking pipelines. Apply training-time and post-training compression techniques. Ramp up quickly on new training code bases and run experiments.
- Understand hardware capabilities and incorporate them into optimization algorithm design and enhancement.
- Keep up with the latest AI research and present recent papers in the field of model compression to the team.
- Collaborate with researchers, hardware engineers, and software engineers to co-develop and discover ideas and optimizations for critical models deployed on specific hardware.
- Run detailed experiments and ablation studies to profile algorithms on various models and tasks across different model sizes.
- Improve model optimization documentation; write tutorials and guides.
- Self-prioritize and adjust to changing priorities and asks.
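To make the post-training compression work concrete, here is a minimal illustrative sketch of one such technique: symmetric per-tensor int8 weight quantization. This is plain Python written for this posting, not the Core ML Tools or PyTorch API; function names (`quantize_int8`, `dequantize`) are hypothetical.

```python
# Illustrative sketch of symmetric per-tensor int8 post-training weight
# quantization. Plain Python for clarity; not the Core ML Tools API.

def quantize_int8(weights):
    """Map float weights to int8 values using a single symmetric scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    # Round to the nearest integer and clamp to the int8 range.
    return [max(-128, min(127, round(w / scale))) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 1.27, -1.27]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
```

In practice, libraries such as `coremltools` apply this kind of transform per layer or per channel and also support activation quantization, pruning, and palettization, but the scale-and-round core is the same idea.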
Bachelor of Science in Computer Science or Engineering
5 years of industry and/or research experience
Highly proficient in Python programming
Proficiency in at least one ML authoring framework, such as PyTorch, TensorFlow, JAX, or MLX
Experience with model compression and quantization techniques, especially within an optimization library for an ML framework.
Demonstrated ability to design user-friendly and maintainable APIs
A deep understanding of the research area of model compression and quantization techniques.
Experience in training, fine-tuning, and optimizing neural network models
Primary contributor to a model optimization/compression library.
Good communication skills, including the ability to communicate with cross-functional audiences