About the Role
We're seeking a Senior Infrastructure Engineer to help build and scale Hyperbolic's GPU Cloud Marketplace by building a multi-tenancy provisioning and virtualization solution. You'll transform raw GPUs from diverse global suppliers into a programmable, orchestrated pool that serves thousands of AI developers and researchers.
Requirements
- Experience with bare-metal provisioning and lifecycle management (e.g. IPMI/Redfish, BMC, PXE, OS deployment)
- Experience with GPU scheduling and orchestration
- Experience with infrastructure and DevOps tools (e.g. Terraform or Pulumi, CI/CD, secrets management, configuration management, observability tools)
- Experience with storage and data infrastructure for AI/ML workloads (e.g. object storage, block storage, distributed file systems)
- Experience with API design and cloud-init
- Experience with GPU architecture, CUDA, and GPU compute
- Experience working with hardware vendors or vendor engineering teams
- Experience building and scaling cloud infrastructure or distributed systems in production environments
Bonus Skills
- Familiarity with high-performance networking technologies such as InfiniBand and RoCE
- Experience with distributed storage systems such as Ceph, Weka, or VAST Data