GPU Acceleration Engineer - Calculation Engine
Massively accelerate the sparse calculation engine of a UK B2B SaaS company in Enterprise Planning & Analytics by porting critical algorithms from Rust/C++ to GPU (CUDA). Turn calculations that are currently impossible (requiring thousands of years of CPU time) into operations achievable in minutes.
The company, a UK B2B SaaS provider of Enterprise Planning & Analytics software, manages planning models reaching 64 quadrillion cells with billions of time periods. Our Hyperblock/Polaris engine is currently limited by:
Legacy CPU architecture (Java/Rust/C++)
Memory constraints on massive sparse structures
Prohibitive calculation times on complex scenarios
Objective: Achieve performance gains of 100x to 1000x via GPU offloading.
Port existing Rust/C++ algorithms to CUDA/GPU
Identify and extract critical calculation paths to accelerate
Optimize sparse matrix operations for GPU architecture
Develop performant Rust-CUDA wrappers
Benchmark and validate performance gains
Design GPU memory management strategies for massive datasets
Implement efficient patterns for sparse structures
Optimize CPU-GPU memory transfers
Manage GPU memory limitations on large-scale calculations
Work with engineering team on integration
Document GPU porting patterns
Participate in code reviews and design reviews
Train the team on GPU best practices
CUDA - Primary GPU development
Rust - Source language for algorithms to port
C++ - Legacy components and CUDA interoperability
(Java - platform context, no development required)
NVIDIA CUDA (toolkit libraries: cuBLAS, cuSPARSE)
Rust (ownership model, unsafe blocks, FFI)
GPU Programming (kernels, memory hierarchy, optimization)
Sparse Matrix Operations (compression, storage formats)
Profiling Tools (nvprof, Nsight, perf)
GPU & CUDA (Essential)
Significant CUDA programming experience (3 years)
Mastery of GPU kernel optimization
Deep knowledge of NVIDIA GPU architecture (memory hierarchy, warps, occupancy)
Experience with sparse calculations on GPU (cuSPARSE or equivalent)
Rust (Essential)
Production Rust development
Mastery of the ownership and borrowing system
Experience with unsafe Rust and FFI (Foreign Function Interface)
Ability to analyze and refactor existing Rust code
C++ (Required)
Modern C++ (C++11/14/17)
C++/CUDA integration
Templates and metaprogramming (asset)
Algorithms (Required)
Data structures for scientific computing
Sparse matrix algorithms (CSR, COO, etc.)
Performance optimization and profiling
Parallelization and concurrency concepts
Documented CPU-to-GPU porting projects
HPC experience (supercomputers, GPU clusters)
Memory optimization for large-scale datasets
Scientific computing or numerical simulation
Rust interop with other languages (C/C++/Python)
100% remote (France/Europe base preferred)
Occasional travel to London
Frequency: 1 week/month for team sprints
Project kickoff, key reviews
Intensive collaboration sessions
Start date: As soon as possible
GECI International is a specialist in Technology and Digital. Since its founding in 1980, the Group has innovated to design and develop intelligent solutions, products, and services for the Research, Industry, and Services sectors.