High-Performance Computing (HPC) System Engineer

LLNL

Job Location:

Livermore, CA - USA

Monthly Salary: Not Disclosed
Posted on: 19-09-2025
Vacancies: 1 Vacancy

Job Summary

We have an opening for a High-Performance Computing (HPC) System Engineer to support one of the largest supercomputer centers in the world. You will work in a challenging, team-oriented environment supporting Livermore Computing's (LC) high-performance computing clusters. You will apply fundamental knowledge of HPC systems and contribute to technical projects using creativity and imagination. This position is in the Livermore Computing Division within the Computing Directorate.

This position may offer a hybrid schedule, which includes the flexibility to work from home one or more days per week after a probationary period. The specifics of the hybrid schedule, including the exact number of days required in the office and virtual work options, may vary based on the needs of the team and the organization.

This position will be filled at either level based on knowledge and related experience as assessed by the hiring team. Additional job responsibilities (outlined below) will be assigned if hired at the higher level.

You will

  • Administer and deploy multiple Linux-based HPC infrastructure and parallel file system servers and clusters.
  • Contribute to the deployment, configuration, and management of high-speed cluster fabrics for computer and storage networks.
  • Conduct installations of software releases, patches of the operating system, and third-party utilities, with emphasis on overall system security.
  • Work with system administrators, Hotline, and Operations staff to improve the quality of service for end users.
  • Analyze, diagnose, and respond to system problems and user questions in person, via email, and through the trouble ticket system while collaborating with other team members.
  • Troubleshoot and determine root cause of system issues with limited complexity.
  • Perform other duties as assigned.

Additional job responsibilities at the SES.2 level

  • Manage and deploy multiple RAID controllers and disk enclosure systems.
  • Analyze performance and implement moderately complex strategies to improve the operation and efficiency of the computer network, file system, and disk subsystems.
  • Develop and maintain programs and scripts that aid in the operation and automation of administrative tasks.

Qualifications:

  • Ability to secure and maintain a U.S. DOE Q-level security clearance, which requires U.S. citizenship.
  • Bachelor's degree in computer science or related field, or the equivalent combination of education and related experience.
  • Fundamental experience with Linux/Unix systems, including installation, configuration, networking, backups, updates and patching, and system security.
  • Understanding of programming and scripting languages such as C/C++, Java, Perl, Python, and bash/csh/ksh.
  • Experience with version control, Terraform/OpenTofu, and configuration management systems such as Git, Ansible, or CFEngine.
  • Experience with disk and storage systems such as host-based RAID controllers, software RAID, and vendor RAID systems.
  • Sufficient communication and interpersonal skills necessary to effectively work with members of the system administration group, application developers, LC staff, and end users.
  • Ability to serve on a rotating off-hours on-call list.

Additional qualifications at the SES.2 level

  • Comprehensive knowledge of HPC environments and HPC technologies such as InfiniBand, Slurm, and Lustre.
  • Proficient experience with local, parallel, and distributed file systems such as Ext4, XFS, NFS, ZFS, Lustre, and GPFS.
  • Ability to work with limited direction in a dynamic environment with competing priorities.

Qualifications We Desire

  • Master's degree in computer science or related field.
  • Experience developing software with C/C++ or Python within Linux or UNIX environments.
  • Experience with VMs and container technologies (e.g., Singularity, Docker, Podman, LXC) and Kubernetes.

Additional Information:

Position Information

This is a Career Indefinite position open to Lab employees and external candidates.

Key Skills

  • JProfiler
  • Splunk
  • Performance Testing
  • Fiddler
  • Apache
  • HP Performance Center
  • LoadRunner
  • New Relic
  • Scalability
  • J2EE
  • Java
  • Scripting

About Company

Join us and make YOUR mark on the World! Are you interested in joining some of the brightest talent in the world to strengthen the United States’ security? Come join Lawrence Livermore National Laboratory (LLNL) where our employees apply their expertise to create solutions for BIG idea ...