Linux HPC Systems Engineer

COLSA Corporation

Job Location:

Huntsville, AL - USA

Monthly Salary: Not Disclosed
Posted on: Yesterday
Vacancies: 1 Vacancy

Job Summary

In this role, your daily impact spans the entire spectrum of systems engineering. One hour you might be performing routine lifecycle maintenance, such as patching a fleet of RHEL workstations or managing user identities across a heterogeneous domain, to ensure the baseline stability of our enterprise. The next, you are diving into the high-performance fabric, debugging a latency spike on an InfiniBand card or fine-tuning a Slurm scheduler to prioritize a mission-critical simulation.

You aren't just managing boxes; you are the bridge between raw silicon and national security breakthroughs. Whether it's the methodical hardening of a standard server build to meet SAP requirements or the high-adrenaline optimization of a multi-petabyte Lustre filesystem, your work ensures that our researchers never have to wait on the infrastructure to catch up with their imagination. This position is 100% on-site.

Responsibilities

  • Architect & Deploy: Lead the design and lifecycle management of mission-critical Linux workstations, enterprise-grade servers, and high-performance computing (HPC) clusters.
  • Engineer Filesystems: Master the art of data movement. Administer complex local and distributed filesystems (Lustre, GPFS/Spectrum Scale) to ensure extreme-speed access across the fabric.
  • Infrastructure as Code (IaC): Treat the data center as a codebase. Develop sophisticated automation workflows using Python, Bash, and Ansible to eliminate manual toil and ensure drift-free configurations.
  • Defensive Engineering: Implement "Hardened by Design" security. Fine-tune SELinux policies and advanced firewall configurations to protect sensitive data without sacrificing computational performance.
  • Container Orchestration: Modernize scientific workflows by deploying and managing isolated environments using Podman, while working to establish a Kubernetes environment.
  • HPC Performance Tuning: Push the limits of the silicon. Optimize cluster scheduling and management using industry-leading tools like Bright Cluster Manager and Slurm.
  • Low-Latency Networking: Configure and optimize high-bandwidth networking, including InfiniBand fabrics, for seamless inter-node communication.
  • Technical Documentation: Author high-fidelity playbooks and strategic architectural diagrams that serve as the blueprint for our evolving infrastructure.

At COLSA, people are our most valuable resource and are at the center of our core values. We invite you to unite your talents with opportunity and be a part of our Family of Professionals! Learn about our employee-centric culture and benefits here.


Required Experience:

IC

Key Skills

  • Air Freight
  • Accounting & Finance
  • Electrical Commissioning
  • General Services
  • Civil Engineering
  • Linux

About Company

Leading Solutions in Defense, Intelligence, Space, & Civilian Markets
