High Performance Compute (HPC) Software Engineer – HPC SW Systems
Ann Arbor, MI - USA
Job Summary
Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device, or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards, and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research, and development. KLA focuses more than average on innovation, and we invest 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists, and problem-solvers work together with the world's leading technology providers to accelerate the delivery of tomorrow's electronic devices. Life here is exciting, and our teams thrive on tackling really hard problems. There is never a dull moment with us.
Job Description/Preferred Qualifications
Key Responsibilities
HPC Software Engineering
Design, develop, and optimize HPC software running on large-scale Linux clusters, including distributed and parallel workloads (MPI, multithreading, GPU-accelerated pipelines, containerized workloads).
Optimize application performance and power utilization across CPU, memory, storage, and network subsystems, with attention to throughput, latency, and scaling behavior.
Develop and maintain system-level tooling for cluster bring-up, diagnostics, and monitoring, including component power usage and health checks.
Work closely with algorithms, systems, and application teams to understand workload characteristics and translate them into power-efficient HPC software solutions.
HPC Systems & Hardware Awareness
Collaborate with hardware and systems teams to define HPC node, storage, and interconnect requirements based on software and algorithm needs.
Understand and influence CPU/GPU selection, memory sizing, PCIe layout, NUMA behavior, and network topology to ensure optimal software performance.
Participate in HW/SW co-debug activities, including performance bottlenecks, stability issues, and failure analysis.
Rack & Infrastructure Engineering
Understand rack-level integration of HPC systems, focusing on power, cooling, cabling, networking, and physical layout considerations.
Understand data-center and lab constraints such as power budgets, thermal limits, network drops, and serviceability.
Contribute to best practices and design reviews for new platforms and refresh cycles.
Cross-Functional Collaboration
Act as a technical bridge between software, hardware, and systems teams.
Provide clear technical documentation covering software and system architecture, deployment flows, and performance assumptions.
Required Qualifications
Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.
Strong experience developing HPC or systems software on Linux.
Proficiency in Java and/or C and/or other system-level or performance-oriented languages.
Hands-on experience with parallel computing (MPI, OpenMP, multithreading). Candidates with GPU computing experience (CUDA, ROCm, or equivalent) are preferred.
Solid understanding of HPC hardware fundamentals: CPUs, memory hierarchies, storage, and networking (Ethernet / InfiniBand).
Practical experience working with clusters, servers, or rack-scale systems in lab or production environments.
Strong debugging skills across software, OS, and hardware boundaries.
Preferred Qualifications
Experience with containerized HPC environments (Docker, Singularity/Apptainer, Kubernetes in HPC contexts).
Familiarity with high-speed interconnects, storage architectures, and performance benchmarking.
Exposure to rack integration, including cabling, power distribution, cooling, and system bring-up.
Experience in semiconductor manufacturing or high-reliability systems environments.
Ability to reason about system reliability, MTBF/MTBA, and failure modes in large compute installations.
What Makes This Role Unique at KLA
Work on mission-critical HPC platforms that directly impact semiconductor manufacturing capability.
Influence both software architecture and physical system design, not just code in isolation.
Collaborate with world-class experts across algorithms, hardware, systems, and operations.
See your work deployed at scale in real production tools, not just in the data center.
Minimum Qualifications
Doctorate (Academic) Degree and 0 years related work experience; Master's Level Degree and related work experience of 3 years; Bachelor's Level Degree and related work experience of 5 years
Base Pay Range: $105,900.00 - $180,000.00 Annually
Primary Location: USA-MI-Ann Arbor-KLA
KLA's total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits, including but not limited to: medical, dental, vision, life, and other voluntary benefits, 401(K) including company matching, employee stock purchase program (ESPP), student debt assistance, tuition reimbursement program, development and career growth opportunities and programs, financial planning benefits, wellness benefits including an employee assistance program (EAP), paid time off and paid company holidays, and family care and bonding leave. Interns are eligible for some of the benefits listed. Our pay ranges are determined by role, level, and location. The range displayed reflects the pay for this position in the primary location identified in this posting. Actual pay depends on several factors, including state minimum pay wage rates, location, job-related skills, experience, and relevant education level or training. We are committed to complying with all applicable federal and state minimum wage requirements where applicable. If applicable, your recruiter can share more about the specific pay range for your preferred location during the hiring process.
KLA is proud to be an Equal Opportunity Employer. We will ensure that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us at or at 1- to request accommodation.
Be aware of potentially fraudulent job postings or suspicious recruiting activity by persons that are currently posing as KLA employees. KLA never asks for any financial compensation to be considered for an interview, to become an employee, or for equipment. Further, KLA does not work with any recruiters or third parties who charge such fees, either directly or on behalf of KLA. Please ensure that you have searched KLA's Careers website for legitimate job postings. KLA follows a recruiting process that involves multiple interviews, in person or on video conferencing, with our hiring managers. If you are concerned that a communication, an interview, an offer of employment, or that an employee is not legitimate, please send an email to to confirm the person you are communicating with is an employee. We take your privacy very seriously and confidentially handle your information.
Required Experience:
IC
About Company
Calling the adventurers ready to join a company that's pushing the limits of nanotechnology to keep the digital revolution rolling. At KLA, we're making technology advancements that are bigger, and tinier, than the world has ever seen. Who are we? We research, develop, and manufacture t ...