AI Performance Engineer

Graphcore

Job Location:

Milpitas, CA - USA

Monthly Salary: Not Disclosed
Posted on: 6 days ago
Vacancies: 1

Job Summary

About us

Graphcore is one of the world's leading innovators in Artificial Intelligence compute.

It is developing hardware, software, and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry.

As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world's most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.

Graphcore's teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers, and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.

Job Summary

Graphcore's AI/ML training and inference infrastructure is rapidly scaling to meet the growing demands of AI workloads across mobile, edge, and datacenter environments. This role focuses on optimizing performance across ARM-based architectures and large-scale distributed systems, ensuring efficiency, scalability, and reliability across the full hardware-software stack.

The Team

The System Engineering Performance team architects and optimizes high-performance infrastructure for large-scale datacenter deployments. The team works across hardware, software, networking, and system architecture to deliver cutting-edge AI solutions and ensure optimal system performance at scale.

Responsibilities and Duties

  • Analyze ML models' compute and memory requirements using roofline analysis and simulations
  • Collaborate across hardware and software teams to optimize large-scale AI workloads
  • Benchmark, monitor, and troubleshoot system performance across distributed systems
  • Optimize communication stacks, including MPI, NCCL, UCX, RDMA, and networking fabrics
  • Profile and optimize AI workloads, focusing on performance bottlenecks
  • Develop high-quality ARM-compatible code and documentation

Candidate Profile

Essential:

  • BS/MS in Computer Science, Electrical Engineering, or a related field
  • Experience with distributed systems and communication libraries (MPI, NCCL, UCX, libfabric)
  • Strong programming skills in C and Python
  • Experience profiling and optimizing HPC or AI/ML workloads
  • Familiarity with ML benchmarks such as MLPerf

Desirable:

  • Experience with GPUs or accelerated computing architectures
  • Knowledge of HPC networking and interconnect technologies (InfiniBand, RoCE)
  • Familiarity with ML frameworks such as PyTorch or TensorFlow
  • Understanding of ARM architectures and toolchains
  • Strong debugging, profiling, and performance optimization skills

Required Experience:

IC

About Company

Python, Javascript, MLOps