Senior AI Infrastructure Engineer

TechChain Talent

Job Location:

San Francisco, CA - USA

Yearly Salary: $150,000 - $250,000

Posted on: 30+ days ago

Vacancies: 1 Vacancy

Job Summary

About the Role

We're seeking a Senior Infrastructure Engineer to help build and scale Hyperbolic's GPU Cloud Marketplace by building a multi-tenancy provisioning and virtualization solution. You'll transform raw GPUs from diverse global suppliers into a programmable, orchestrated pool that serves thousands of AI developers and researchers.

Requirements

- Experience with bare-metal provisioning and lifecycle management (e.g. IPMI/Redfish, BMC, PXE, OS deployment)

- Experience with GPU scheduling and orchestration

- Experience with infrastructure and DevOps tools (e.g. Terraform or Pulumi, CI/CD, secrets management, configuration management, observability tools)

- Experience with storage and data infrastructure for AI/ML workloads (e.g. object storage, block storage, distributed file systems)

- Experience with API design and cloud-init

- Experience with GPU architecture, CUDA, and GPU compute

- Experience working with hardware vendors or vendor engineering teams

- Experience building and scaling cloud infrastructure or distributed systems in production environments

Bonus Skills

- Familiarity with high-performance networking technologies such as InfiniBand and RoCE

- Experience with distributed storage systems such as Ceph, Weka, or VAST Data
