Berkeley Lab's (LBNL) Information Technology Division (IT) has an opening for a Senior HPC Cluster Systems Administrator to join their ScienceIT Team!
In this exciting role, you will support the Berkeley Lab research community by building, integrating, and maintaining Linux-based resources, high-performance computing cluster systems, and Kubernetes clusters. This role provides extensive expertise in High Performance Computing infrastructure and delivers advanced Linux solutions to further scientific endeavors at Berkeley Lab. The mission of Scientific Computing under ScienceIT is to facilitate groundbreaking fundamental research globally by providing essential computing tools, networks, and expertise to enable pioneering science.
This position has an anticipated start date of January 5, 2026.
We're here for the same mission, to bring science solutions to the world. Join our team and YOU will play a supporting role in our goal to address global challenges! Have a high level of impact and work for an organization associated with 17 Nobel Prizes!
Why join Berkeley Lab?
We invest in our employees by offering a total rewards package you can count on:
- Exceptional health and retirement benefits, including pension or 401K-style plans
- Opportunities to grow in your career - check out our Tuition Assistance Program
- A culture where you'll belong - we are invested in our teams!
- In addition to accruing vacation and sick time, we also have an annual Winter Holiday Shutdown
- Parental bonding leave (for both mothers and fathers)
- Pet insurance
What You Will Do:
- Perform Linux system and HPC cluster maintenance and installations, operating system upgrades, system security hardening and intrusion detection, storage and file system management, system hardware, customization of user group working environment, troubleshooting, network monitoring, and crash recovery.
- Design, deploy, and manage scalable applications using Kubernetes, ensuring the availability, performance, and readiness of the Kubernetes infrastructure.
- Automate deployment, scaling, and management of containerized applications, and collaborate with DevOps and development teams to streamline CI/CD pipelines.
- Design, deploy, and manage the global storage platform to ensure high performance, massive scalability, reliability, and future-proof solutions.
- Support storage technologies, such as Lustre and VAST, and networks.
- Resolve I/O issues related to business applications, including diagnosing and resolving complex storage, Linux, and networking challenges in a fast-paced environment.
- Research new storage management technologies and techniques, and provide recommendations.
- Participate in developing system administration, security, and network policies, documentation, and tools oriented towards efficient systems management.
- Participate in cluster support to staff and researchers, including initial installation, integration, and ongoing maintenance of Linux High-Performance Computing cluster systems. This includes travel to remote sites as needed.
- Co-lead technical efforts with other senior system administrators in areas of HPC technologies such as job schedulers, high-performance interconnects, parallel file systems, cybersecurity, cluster management, container orchestration, VM infrastructure, networking, performance tuning, or data center planning.
- Co-lead group projects of small to medium size and complexity to implement and deploy new computing technologies and associated services to the research community.
What We Are Looking For:
- A Bachelor's degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related discipline and a minimum of 12 years of relevant experience in Linux system administration within a large distributed computing environment, including experience providing systems and end-user support for multiple scientific or computational research groups, or an equivalent combination of education and experience.
- Demonstrated ability to manage large-scale, performance-critical environments, including capacity planning, scaling, and optimization.
- Significant experience deploying, scaling, and managing Kubernetes clusters, with a strong understanding of its architecture (pods, deployments, services, ingress) and container orchestration. Proven proficiency with CI/CD tools like Jenkins or GitLab CI.
- Proven experience with Red Hat derivatives (CentOS, Scientific Linux, Rocky Linux), Debian, Ubuntu, and large-scale system and configuration management tools (Kickstart, Ansible, Puppet, Chef, Warewulf). Expertise in supporting standard services (NFS, LDAP, SMB, MySQL, Apache/Nginx HTTPD).
- Strong HPC expertise, including Linux job schedulers, high-performance interconnects, parallel file systems, cybersecurity, container orchestration, cluster management, VM infrastructure, networking, performance tuning, scientific application support, and data center planning.
- Proficiency in Python and Bash for building, optimizing, and debugging scientific codes (C, C++, Fortran, Java), including experience with compilers (GCC, Intel), debuggers, Makefiles, and version control (git, Subversion).
- Expertise in storage system design and optimization (Lustre, S3, VAST, Weka, Ceph, DDN), including a deep understanding of the storage stack (kernel to user space, including file systems, block storage, I/O schedulers, VFS), storage benchmarking, and performance tuning (throughput, latency, IOPS, workload-specific optimizations).
- Excellent oral and written communication skills, including experience organizing and presenting customer-focused technical data, reports, and projects to audiences with varying degrees of technical expertise.
- Strong interpersonal skills, including experience with research facilitation and project management in a multidisciplinary team environment.
Desired Qualifications:
- An advanced degree (or equivalent knowledge/training) in Computer Science, Engineering, or a related discipline.
- Experience with software engineering and/or software development.
- Familiarity with Kubernetes-related tools like Helm, Istio, and Prometheus.
- Demonstrated experience supporting research at a National Lab and/or in an academic or research environment.
Additional Information:
- Application Deadline: For full consideration, please apply with a resume and a cover letter describing your interest by November 30, 2025.
- Appointment type: This is a full-time career appointment, exempt (monthly paid) from overtime pay.
- Salary Information: This position is expected to pay $178,644 - $218,364 annually, which fits within the full salary range of $158,808 - $267,996 annually for job code C70.4. It is not typical for an individual to be offered a salary at or near the top of the range for a position. Salary for this position will be commensurate with the final candidate's qualifications and experience, including skills, knowledge, relevant education, and certifications, and aligned with the internal peer group.
- Background Check: This position may be subject to a background check. Any convictions will be evaluated to determine if they directly relate to the responsibilities and requirements of the position. Having a conviction history will not automatically disqualify an applicant from being considered for employment.
- Work Modality: This position is eligible for a hybrid work schedule - a combination of teleworking and performing work on site at Lawrence Berkeley National Lab, 1 Cyclotron Road, Berkeley, CA 94720. Work schedules are dependent on business needs. Individuals working a hybrid schedule must reside within 150 miles of Berkeley Lab. Starting May 7, a REAL ID or other acceptable form of identification is required to access Berkeley Lab sites.
- Relocation: This position is eligible for relocation assistance.
- Work Authorization: Applicants must be legally authorized to work in the United States. Berkeley Lab does not provide visa sponsorship for this position.
Want to learn more about working at Berkeley Lab? Please visit:
Equal Employment Opportunity Employer: The foundation of Berkeley Lab is our Stewardship Values: Team Science, Service, Trust, Innovation, and Respect; and we strive to build community with these shared values and commitments. Berkeley Lab is an Equal Opportunity Employer. We heartily welcome applications from all who could contribute to the Lab's mission of leading scientific discovery, excellence, and support of our rich global community. All qualified applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, protected veteran status, or other protected categories under State and Federal law.
Berkeley Lab is a University of California employer. It is the policy of the University of California to undertake affirmative action and anti-discrimination efforts consistent with its obligations as a Federal and State contractor.
Misconduct Disclosure Requirement: As a condition of employment, the finalist will be required to disclose if they are subject to any final administrative or judicial decisions within the last seven years determining that they committed any misconduct, are currently being investigated for misconduct, left a position during an investigation for alleged misconduct, or have filed an appeal with a previous employer.