HPC Admin
Experience: 6-8 Years
Location: Chennai, India
Minimum 5 years of experience in the following skills:
HPC systems
Clusters
Linux systems
HPC HW knowledge, especially in the server, GPU, networking, storage, BIOS, and BMC areas
TCP/IP fundamentals
BE/BTech or MS degree in Computer Engineering, Electrical Engineering, or a related field, with 6 to 10 years of validated experience
Key Responsibilities:
Design, implementation, and support of high-performance compute clusters.
Solid knowledge of HPC systems, including CPU/GPU architecture, scalable and robust storage, high-bandwidth interconnects, and cloud-based computing architectures.
Apply attention to detail to generate HW BOMs for the HPC clusters, provide vendor management, and oversee HW release activities.
Use strong Linux OS skills to configure appropriate operating systems for the HPC system.
Understand and assemble the project specifications and performance requirements at the subsystem and system levels. Adhere to and drive project timelines to ensure program milestones are completed on time.
Support the design and release of new products to manufacturing and ultimately to the customer, providing quality golden images, procedures, scripts, and documentation to the manufacturing and customer support teams.
Required Qualifications:
Validated, in-depth, and flavor-agnostic knowledge of Linux systems (SUSE, Red Hat, Rocky, Ubuntu).
Experience building and maintaining robust storage.
Strong HPC HW knowledge, especially in the server, GPU, networking, storage, BIOS, and BMC areas.
Experience with systemd, network boot/PXE, and Linux HA.
Strong understanding of TCP/IP fundamentals and knowledge of protocols such as DNS, DHCP, HTTP, LDAP, and SMTP.
Ability to develop Shell and Python scripts.
Experience with one or more configuration management utilities (Salt, Chef, Puppet, etc.).
Preferred Qualifications:
Strong DevOps focus: knowledge of setting up a continuous development pipeline (Jenkins), repository software (Git-based), and Singularity and Docker containers.
Kubernetes, Prometheus, and Grafana experience.
Knowledge of Apache/Nginx: setting up proxy/reverse proxy, application server routing, and load balancing (HAProxy).
BS or MS degree in Computer Engineering, Electrical Engineering, or a related field, with 6 to 10 years of validated experience.
Skills and Abilities:
Team Orientation & Interpersonal: Highly motivated teammate with the ability to develop and maintain collaborative relationships at all levels within and outside the organization.
Organization & Time Management: Able to plan, schedule, organize, and follow up on job-related tasks to achieve goals within or ahead of established time frames.
Multitasking: Able to organize, coordinate, manage, prioritize, and perform multiple tasks simultaneously, and to swiftly assess a situation, determine a logical course of action, and apply the appropriate response.
Adaptability to Change: Flexible and supportive, and able to assimilate change positively and proactively in a rapid-growth environment.
Outstanding teammate with excellent written and verbal communication skills.
git, configuration management (salt, chef, puppet), docker, grafana, apache/nginx, hpc hw knowledge, prometheus, hpc systems, high performance computing (hpc), admin, shell scripting, tcp/ip fundamentals, linux systems, kubernetes, clusters, bios, load balancing, devops, gpu, computer engineering, storage, jenkins, linux, python scripting