Create optimized implementations of ML workloads on Apple silicon, including the Neural Engine, GPU, and CPU. Collaborate with IP and SoC architecture teams to develop performance models and simulations of future hardware. Conduct performance studies to inform and validate architecture decisions. Collaborate with the system team to create high-level performance models of emerging ML techniques and analyze system architecture trade-offs.
Bachelor's degree
Ability to program in C/C++ and/or Python
Knowledge of computer architecture fundamentals
Domain knowledge in at least one hardware IP: ML HW accelerators, or processing units such as GPU, image/video, CPUs, or similar
MS or PhD in EE/CE/CS or a related field, or 3 years of relevant experience
Experience in efficient implementation of machine learning algorithms
Experience in creating system or IP performance models/simulations
Verbal and written communication skills for collaborating with partner teams
Familiarity with deep learning frameworks such as PyTorch
Ability to prototype and benchmark algorithms on CPU/GPU/Neural Engine, analyze performance metrics, and create high-level complexity models
Ability to develop hardware accelerator performance and bit-accurate models