Staff GPU Systems Engineer, Space Computing
Long Beach, NY - USA
Job Summary
At Relativity Space were building rockets to serve todays needs and tomorrows breakthroughs. OurTerran R vehicle will deliver customer payloads to orbit meeting the growing demand for launchcapacity. But thats just the start. Achieving commercial success with Terran R will unlock newopportunities to advance science exploration and innovation pioneering progress that reaches beyondthe known.
Joining Relativity means becoming part of something where autonomy ownership and impact exist at every level. Here youre not just executing tasks; youre solving problems that havent been solved before helping develop a rocket a factory and a business from the ground up. Whether youre in propulsion manufacturing software avionics or a corporate function youll collaborate across teams shape decisions and see your work come to life in record time. Relativity is a place where creativity and technical rigor go hand in hand and your voice will help define the stories were writing together. Nowis a unique moment in time where its early enough to leave your mark on the product the process andthe culture but far enough along that Terran R is tangible and picking up momentum. The mostmeaningful work of your career is waiting. Join us.
About the Team:
The Interplanetary Sciences Program was establishedto expand access to scientific exploration across our solar system. Its mission is to make planetary research faster more affordable and more capable than ever before by rethinking how science missions are designed built andoperated. The program aims to enable scientists to send instruments to distant worlds without decades of development or prohibitive costs. By creating a sustainable model for interplanetary exploration we are transforming space science from an occasional event into a continuous process of discovery that accelerates knowledge broadens participation and inspires the next generation of explorers.
About the Role:
- Own the GPU compute environment for a space-based data center setup driver integration container runtime job scheduling and performance optimization building the platform that enables onboard AI/ML inference and SAR reprocessing millions of miles from the nearest sysadmin
- Profile and optimize compute performance across the full stack: GPU utilization memory bandwidth I/O throughput and storage interface performance squeezing maximum science return from constrained power and thermal budgets that shift between sunlit burst processing and eclipse idle periods
- Build power and thermal-aware compute scheduling that orchestrates batch workloads around orbital constraints coordinating with the storage platform to sustain 10 Gbps data movement between NAS and compute nodes during processing windows
- Develop compute health monitoring and upset recovery mechanisms checkpoint/restart strategies GPU fault detection and automated recovery so a radiation-induced upset means a restarted job not a lost processing window
- Integrate GPU drivers with the payload Linux image in coordination with the Platform RE manage the container runtime for compute workloads and ensure the platform reliably runs ML frameworks and SAR processing pipelines maintained by the broader operations team
About You:
- BS/MS in Computer Science or Electrical Engineering and 5 years of relevant experience
- Hands-on experience with GPU programming and compute frameworks CUDA ROCm or OpenCL with real performance profiling and optimization work not just running tutorials
- Strong Linux systems administration and performance tuning skills: youve diagnosed I/O bottlenecks tuned memory management and understood why a workload isnt hitting expected throughput
- Experience with container technologies (Docker Podman or lightweight alternatives) and HPC job scheduling concepts
- Working proficiency in Python for tooling scripting and ML framework integration with C/C skills for performance-critical system components
Nice to haves but not required:
- Experience with HPC cluster administration ML infrastructure or cloud GPU compute platforms at scale
- Deep familiarity with ML framework runtime requirements PyTorch or TensorFlow deployment model serving and inference optimization
- Knowledge of GPU compute architectures at the hardware level: CUDA cores compute units memory hierarchies and how they affect real workload performance
- Experience with high-throughput data movement and storage I/O optimization NFS tuning buffer management and sustaining multi-gigabit throughput
- Background in power-managed computing: duty cycling thermal throttling and workload scheduling under variable power constraints
- Experience designing checkpoint/restart or fault-tolerant batch processing systems space experience not required similar problems exist in large-scale distributed infrastructure and autonomous systems
At Relativity Space we are committed to transparency and fairness in our compensation practices. Actual compensation will be determined based on experience qualifications and other job-related factors.
Compensation is only one part of our total rewards package. Relativity Space offers competitive salary and equity a generous PTO and sick leave policy parental leave an annual learning and development stipend and more!To see some of the benefits & perks we offer please visit here.
Hiring Range:
$181000 - $248500 USD
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race religion color national origin gender sexual orientation age marital status veteran status or disability status.
If you need a reasonable accommodation please contact us at .
Required Experience:
Staff IC
About Company
A rocket company at the core, Relativity Space is on a mission to become the next great commercial launch company.