HPC Systems Engineer
Ann Arbor, MI - USA
Job Summary
Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA focuses more than average on innovation and we invest 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world's leading technology providers to accelerate the delivery of tomorrow's electronic devices. Life here is exciting and our teams thrive on tackling really hard problems. There is never a dull moment with us.
Group/Division
With over 40 years of semiconductor process control experience, chipmakers around the globe rely on KLA to ensure that their fabs ramp next-generation devices to volume production quickly and cost-effectively. Enabling the movement towards advanced chip design, KLA's Global Products Group (GPG), which is responsible for creating all of KLA's metrology and inspection products, is looking for the best and the brightest research scientists, software engineers, application development engineers and senior product technology process engineers. Central Engineering is KLA's largest engineering organization, comprised of 9 Centers-of-Excellence (CoE) in various disciplines applied across all product groups in the company. These CoE include Handling & Automation, Precision Motion Control, Sensors & Image Acquisition, Platform Design and Packaging Engineering, among others. Talent includes over 500 engineers across global centers in Israel, China, India and the US. Each CoE contributes not just talent and deliverables per discipline toward product programs, but also subject matter expertise, best practices, roadmaps, specialized facilities, apparatus, models and analytics. These differentiate KLA not only in WHAT we do, but also in HOW we do it.
Job Description/Preferred Qualifications
We're looking for an HPC Systems Engineer to help power the compute infrastructure behind our R&D innovation! In this role, you'll support and evolve a high-performance Linux cluster used for physics modeling, simulation, algorithm development and machine-learning workloads, enabling hundreds of engineers to do their best work every day. You'll play a key role in driving the reliability, performance and scalability of a shared, mission-critical HPC environment, partnering closely with infrastructure, DevOps and application teams to keep the platform fast, resilient and ready for the most demanding computational challenges!
Key Responsibilities:
HPC Platform Operations
- Operate and maintain a large-scale, Linux-based HPC cluster used for internal R&D workloads
- Manage compute nodes, login nodes and supporting infrastructure in a multi-tenant environment
- Monitor cluster health, performance and capacity; respond to incidents and degradations
Scheduler & Workload Management
- Configure, tune and support HPC job schedulers (e.g. SLURM, LSF, PBS or equivalent)
- Assist users with job submission issues, resource requests and queue optimization
- Help optimize scheduler policies to balance throughput, fairness and utilization
Linux Systems Engineering
- Install, configure and maintain Linux operating systems across compute and service nodes
- Manage OS updates, kernel changes, drivers (including GPU drivers where applicable) and system hardening
- Troubleshoot complex Linux performance, networking, storage and process-level issues
Performance & Scaling
- Support high-throughput and parallel workloads across CPU and GPU resources
- Diagnose performance bottlenecks across compute, storage, network and scheduler layers
- Assist with scaling activities such as node expansions, re-provisioning and hardware refreshes
Automation & Reliability
- Use automation and configuration management tools to ensure consistency across the cluster
- Contribute to scripting and tooling for node provisioning, validation and lifecycle management
- Participate in on-call or escalation rotations as required to support a production R&D platform
Collaboration & User Support
- Partner with internal engineering teams to understand workload requirements and usage patterns
- Provide guidance and best practices for running workloads efficiently on shared HPC systems
- Contribute to internal documentation and operational runbooks
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering or equivalent practical experience
- 3 years of hands-on Linux systems administration experience
- Direct experience working with HPC or large-scale compute environments
- Practical experience with at least one HPC scheduler (SLURM, LSF, PBS or similar)
- Strong Linux troubleshooting skills (processes, memory, I/O, networking, performance analysis)
- Comfort working in CLI-driven production infrastructure environments
Preferred:
- Experience supporting GPU-accelerated workloads (CUDA, drivers, GPU scheduling concepts)
- Familiarity with parallel computing or scientific/engineering workloads
- Experience with cluster storage systems (e.g. Lustre, BeeGFS, NFS or high-performance NAS/SAN)
- Exposure to automation tools (Ansible, scripting, Infrastructure-as-Code concepts)
- Familiarity with containers in HPC contexts (Singularity / Apptainer, rootless containers)
- Experience supporting internal developer or research communities
Minimum Qualifications
Doctorate (Academic) Degree and 0 years related work experience; Master's Level Degree and related work experience of 3 years; Bachelor's Level Degree and related work experience of 5 years
Base Pay Range: $105,900.00 - $180,000.00 Annually
Primary Location: USA-MI-Ann Arbor-KLA
KLA's total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits, including but not limited to: medical, dental, vision, life and other voluntary benefits, 401(K) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave. Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum pay wage rates, location, job-related skills, experience and relevant education level or training. We are committed to complying with all applicable federal and state minimum wage requirements where applicable. If applicable, your recruiter can share more about the specific pay range for your preferred location during the hiring process.
KLA is proud to be an Equal Opportunity Employer. We will ensure that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us at or at 1- to request accommodation.
Be aware of potentially fraudulent job postings or suspicious recruiting activity by persons that are currently posing as KLA employees. KLA never asks for any financial compensation to be considered for an interview, to become an employee or for equipment. Further, KLA does not work with any recruiters or third parties who charge such fees, either directly or on behalf of KLA. Please ensure that you have searched KLA's Careers website for legitimate job postings. KLA follows a recruiting process that involves multiple interviews, in person or on video conferencing, with our hiring managers. If you are concerned that a communication, an interview, an offer of employment or that an employee is not legitimate, please send an email to to confirm the person you are communicating with is an employee. We take your privacy very seriously and confidentially handle your information.
Required Experience:
IC
About Company
Calling the adventurers ready to join a company that's pushing the limits of nanotechnology to keep the digital revolution rolling. At KLA, we're making technology advancements that are bigger (and tinier) than the world has ever seen.