In this role you will be responsible for supporting, testing, and deploying HPC infrastructure products at our operations core. You will help plan, code, build, test, deploy, operate, and monitor our Infrastructure-as-Code solutions for HPC servers.
Responsibilities will include: demonstrating strong troubleshooting skills by independently identifying, resolving, and remediating system performance and availability issues; developing automation for common development and operational tasks; maintaining clear, current documentation of system configurations, including creating detailed justifications, training materials for complex topics, status reports, and procedural documents; working with application, infrastructure, network, and storage engineering teams to find balanced solutions to engineering problems; and planning future capacity requirements and evaluating new product features or enhancements.
A Bachelor's degree in Computer Science with at least 5 years of relevant experience, or an equivalent professional background.
Proven experience in an HPC support role in an enterprise environment with 500-node clusters.
Experience deploying and managing schedulers such as SLURM, LSF, and/or NC.
Experience deploying and configuring FEA solvers to run on HPC.
Experience with NVIDIA GPU compute.
Strong Linux administration skills.
Experience with InfiniBand, including IPoIB and RDMA.
Experience with multiple flavors of MPI.
Experience with machine learning and deep learning concepts, algorithms, and models.
Background in Software Defined Networking and AI/HPC cluster networking.
Familiarity with deep learning frameworks such as PyTorch and TensorFlow.
Experience with automation and configuration management tools like Ansible, Cobbler, and Puppet.
Experience developing and securing containerized applications and HPC environments (e.g. Apptainer) is beneficial.
Experience with virtualization technologies is beneficial.