We are seeking an experienced Sr. Systems Engineer to support a small standalone system dedicated to high-performance computing (HPC) and artificial intelligence (AI) workloads. This role demands a blend of operational expertise and strategic technical vision, focusing on the management and optimization of our partner's standalone HPC/AI system. The ideal candidate will manage the technical operation of the partner's infrastructure; develop standardized procedures for hardware, network, and software management across the system; and expertly oversee cluster management, including provisioning, optimization, and monitoring of clustered resources for HPC/AI workloads with tools such as NVIDIA BCM.
This position requires broad expertise in HPC/AI system administration with a focus on:
Refining infrastructure management frameworks
Traditional infrastructure management (hardware, networking, directory services)
Modern HPC/AI support (Linux/Ubuntu, Proxmox, NVIDIA BCM, WEKA storage)
Designing scalable, secure, and highly available system architectures
Requirements
TS/SCI FSP Clearance on day one
Bachelor's degree in engineering, computer science, or a related technical field, or equivalent experience
7 years of experience in systems engineering or related fields
Operating Systems & Infrastructure:
Expert-level Linux systems engineering
Windows client operating systems deployment/maintenance
Linux (Ubuntu) server operating systems deployment/maintenance
Hardware & Networking:
Server hardware
Network hardware, wiring, and switching configurations
Virtualization & Containerization:
Virtualization (ideally Proxmox)
Containerization (ideally Docker/Podman with Ray or Kubernetes)
Management & Orchestration:
Directory services and PKI infrastructure deployment/maintenance
Configuration management (ideally Ansible, Puppet, Chef, or DSC)
Cluster orchestration (ideally NVIDIA Base Command Manager (BCM))
Development Support & Software Management:
Development support services (GitLab, Jenkins, Nexus)
Operating system software repository synchronization (APT, Snap, YUM)
Desired Skills
Supporting or developing on standalone networks
Supporting or developing HPC or AI workloads using hardware acceleration
Experience complying with enterprise data policy
Experience with system security policy and accreditation processes
About Us
For more than 20 years, NewGen Technologies has solved our clients' toughest IT challenges with integrity, security, and outstanding service by delivering both technology and talent. We have helped secure borders, used artificial intelligence (AI) to fight terror, aided the identification of criminals, and helped to prevent crime through the introduction of [...]. Our team of Highly Cleared Specialists has hard-to-find skills and expertise in a wide spectrum of technologies to provide solutions that transform business processes and solve problems of national significance.