Solutions Architect AI ML Training & GPU infra
Den Bosch - Netherlands
Job Summary
AI/ML Solutions Architect Distributed Training & GPU Infrastructure
Company
Join a fast-moving AI infrastructure team working on the cutting edge of large-scale ML workloads. This role is ideal for engineers who enjoy solving deep technical challenges in distributed training multi-GPU systems and scalable AI inference infrastructure. You will work directly with AI-focused clients helping them get the most out of modern GPUs (H100 B200 etc.) and ML frameworks such as PyTorch (and JAX in some environments).
Team & Responsibilities
Work alongside senior AI and infrastructure engineers building large-scale GPU platforms. As part of the customer solutions team you will:
Design and validate production-grade distributed training (primary) and largescale inference architectures on large GPU clusters typically tens to thousands of GPUs
Work hands-on with customers to debug optimize and scale ML workloads across multi-node GPU environments
Act as a technical authority on GPU performance networking and schedulers making trade-offs at scale and translating customer needs into concrete platform requirements
Collaborate closely with engineering product and R&D to influence roadmap decisions based on real-world ML workloads
This is a hands-on technical role; you are expected to work directly in customer environments not only advise at a high level
Required skills and experience
Hands-on experience designing and operating production-grade multi-node GPU workloads for training or inference
Strong background in distributed deep learning (PyTorch Distributed DeepSpeed) on GPU clusters
Deep understanding of GPU architecture and interconnects (H100/A100 class NVLink InfiniBand)
Experience with Kubernetes or Slurm and performance tuning using GPU profiling and monitoring tools
This role is not a fit if your experience is limited to single-node training high-level AI strategy or non-production research environments. We are looking for engineers and architects who thrive at the intersection of AI workloads and large-scale infrastructure.
Whats offered
Location: Remote from anywhere in Europe
Total compensation up to EU 300k (base variable) depending on level and experience
Key Skills
About Company
We focus on job opportunities in The Netherlands for IT and engineering professionals. We share relevant tips and tricks with jobseekers and we can support employers with regards to relocation, work permit rules, 30% ruling et cetera. We value transparancy, honesty and a no-nonsense a ... View more