AI Lab Technical Support Specialist

SOMERSET STAFFING


Job Location:

Milpitas, CA - USA

Monthly Salary: Not Disclosed
Posted on: 8 days ago
Vacancies: 1 Vacancy

Job Summary

Role Overview


We're looking for a hands-on engineer/technician to assist with the setup, maintenance, and operation of our high-performance computing cluster.

This role is ideal for someone with practical experience with Linux systems in a data center who enjoys working in a fast-paced technical environment.



Key Responsibilities


Rack, stack, cable, and maintain equipment in the AI data center and lab.

Perform routine maintenance and troubleshooting on Linux servers, storage, and networking systems.

Use tools to monitor and troubleshoot hardware issues.

Work closely with engineers and developers to ensure smooth operation of the AI infrastructure.



Required Skills/Experience


Experience with assembly of mechanical or electrical systems or performing component-level repairs and troubleshooting on technical equipment.

Ability to lift/move 50 lb (23 kg) of equipment and to exert yourself physically over extended periods, including frequent bending, kneeling, climbing, pushing/pulling, and lifting.

Experience working within a data center or network operations center environment.

Comfortable working in a Linux environment and able to diagnose and troubleshoot issues in operating systems, computer/server hardware, or the networking stack.

Able to write and understand simple Bash or Python scripts.

Exposure to Git, Jenkins, or similar tools is a plus.



Role Overview

This role is a hands-on, hardware-focused technical support position supporting GPU/compute clusters in an AI lab/R&D environment. The emphasis is on hardware troubleshooting, Linux-based system support, and a deep understanding of compute architecture rather than software development.


Key Responsibilities


Troubleshoot GPU/CPU servers, compute clusters, and networking (InfiniBand)

Diagnose hardware issues (cabling, components, GPUs, servers)

Rack/stack work is limited initially (systems are already built) but may increase if the role is extended

Replace/install server components within racks

Use Linux command line extensively for diagnostics and system validation

Manage lab space and hardware inventory (re-procurement access provided)



Must-Have Skills (Non-Negotiable)


Strong hardware troubleshooting experience (servers, GPUs, compute systems)

Solid understanding of computer/compute architecture

Strong Linux skills for system bring-up and troubleshooting

Experience with GPUs and high-performance compute environments

Ability to independently diagnose and resolve hardware/system issues



Preferred / Nice-to-Have


Prior data center or HPC/compute cluster experience (a plus, not mandatory)

Scripting experience (Bash, Python); expected if the candidate has held similar roles

Familiarity with GPU technologies (cutting-edge R&D GPUs; Tesla, etc.)

Candidates who've built systems themselves (gaming PCs, lab servers, small data centers)



Experience & Education


Minimum: 3 to 4 years of relevant experience (not pure sysadmin only)

Bachelor's degree preferred, but experience matters more than the degree

No travel required


Background Check : No

Drug Screen : No
