HPC Network Solutions Architect

Job Location:

Marshall County, WV - USA

Monthly Salary: Not Disclosed
Posted on: 2 hours ago
Vacancies: 1 Vacancy

Job Summary

HPC Network Solutions Architect
Location: Dallas, TX (Hybrid)
Type: Direct Hire

Competitive base salary and performance bonus
100% company-paid benefits

Overview

We are seeking an HPC Network Solutions Architect to design, integrate, and optimize high-performance networking architectures supporting HPC, AI/ML, and data-intensive workloads.

This is a customer-facing, technically focused role responsible for guiding clients across the full solution lifecycle, from requirements gathering and architecture design through proof-of-concept, deployment, and long-term optimization. The role bridges advanced networking technologies with real-world HPC adoption, ensuring low-latency, high-bandwidth infrastructure aligns with workload demands.

The ideal candidate brings deep expertise in HPC networking, strong experience across InfiniBand and Ethernet-based architectures, and the ability to translate complex requirements into scalable, production-ready solutions.

Key Responsibilities

Customer Engagement & Architecture Leadership

Serve as the primary networking subject matter expert for customers adopting or scaling HPC environments
Capture performance goals, scalability requirements, and integration constraints to inform solution design
Lead customer workshops, architecture reviews, and technical design sessions

HPC Network Architecture & Design

Design and document end-to-end HPC network architectures, including Ethernet, InfiniBand, RoCE, EVPN, and VXLAN fabrics
Define scalable, low-latency network designs aligned with HPC and AI/ML workload requirements
Develop architecture blueprints and integration strategies across compute, storage, orchestration, and security layers

Performance Optimization & Benchmarking

Lead proof-of-concept and benchmarking initiatives to validate network performance and throughput
Conduct network performance assessments, tuning, and optimization to eliminate bottlenecks
Address scaling challenges such as data gravity, east-west traffic, and high-throughput demands

Observability & Monitoring

Design and implement observability frameworks using Prometheus, Grafana, and vendor telemetry tools
Provide visibility into network health, utilization, and performance across large-scale environments

Cross-Functional Collaboration

Partner with engineering, product, and operations teams to refine architecture standards and delivery practices
Collaborate with compute, storage, and platform teams to ensure integrated, workload-aware solutions
Support multi-vendor environments and evaluate new networking technologies

Vendor & Ecosystem Engagement

Work closely with vendors such as NVIDIA, Mellanox, Cisco, and Arista to integrate advanced capabilities
Influence vendor roadmaps through feedback and joint evaluations
Stay current on emerging HPC networking technologies and provide forward-looking guidance to customers

Thought Leadership & Innovation

Represent the organization in customer engagements, workshops, and industry events
Provide strategic insight into future networking trends, including interconnect advancements and scalable architectures
Contribute to best practices and reusable architectural patterns

Required Experience

Proven experience in HPC networking architecture, data center network engineering, or large-scale distributed systems design
Deep expertise with InfiniBand and RoCE, including deployment and tuning in production environments
Strong experience designing large-scale Ethernet networks using BGP, OSPF, EVPN, and VXLAN
Understanding of GPU communication frameworks such as MPI and NCCL and their interaction with HPC interconnects
Experience working in Linux environments, with scripting skills (Python, Bash, or PowerShell) for automation
Experience supporting multi-vendor networking environments
Ability to translate complex technical requirements into clear, scalable architectures
Strong customer-facing communication skills, with experience engaging both technical and executive stakeholders

Technical Skills

Experience with network observability and telemetry platforms
Familiarity with automation and Infrastructure-as-Code tools such as Terraform and Ansible
Exposure to CNI plugins such as Multus, Cilium, and NVIDIA CNI for Kubernetes/HPC environments

Preferred Experience

Experience delivering HPC or AI/ML workloads across large-scale, low-latency network environments
Experience collaborating with vendors and influencing product direction
Contributions to open-source HPC or networking projects
Bachelor's or Master's degree in Computer Science, Networking, Engineering, or a related field
Certifications such as Cisco CCNP/CCIE, Juniper JNCIP, AWS Advanced Networking Specialty, or Red Hat RHCE
