Job Title: AWS High Performance Computing (HPC) Architect & Subject Matter Expert (SME)
Overview:
The AWS HPC Architect & SME is responsible for designing, implementing, and optimizing high-performance computing solutions on the AWS Cloud platform. This role combines deep technical expertise in distributed computing, data-intensive workflows, and AWS HPC services with the ability to lead architecture design sessions, define best practices, and ensure scalability, performance, and cost efficiency across enterprise or research workloads.
Key Responsibilities:
Architect and Design: Develop scalable, high-performance architectures leveraging AWS HPC services such as AWS ParallelCluster, FSx for Lustre, EFA (Elastic Fabric Adapter), AWS Batch, and EC2 HPC instances.
Solution Implementation: Deploy, automate, and optimize HPC clusters and data pipelines for compute- and memory-intensive workloads, including modeling, simulation, genomics, CFD, AI/ML training, and financial risk analysis.
Performance Optimization: Benchmark, tune, and monitor system performance for compute, storage, and networking components to achieve optimal throughput and cost efficiency.
Infrastructure as Code (IaC): Implement reproducible environments using Terraform, AWS CDK, or CloudFormation to streamline provisioning, CI/CD, and configuration management.
Data and Storage Management: Design high-throughput parallel storage solutions using S3, FSx for Lustre, EBS, and EFS; integrate with hybrid and on-prem HPC environments.
Security and Compliance: Apply the AWS Well-Architected Framework and HPC security best practices to ensure compliance with enterprise, academic, or government standards.
Collaboration and Leadership: Partner with application scientists, DevOps teams, and business stakeholders to translate workload requirements into optimized HPC architectures. Provide mentoring and technical leadership across multidisciplinary teams.
Documentation and Knowledge Sharing: Develop architecture diagrams, reference implementations, and technical playbooks to support ongoing HPC adoption and operations.
Required Skills & Experience:
8 years of experience in high-performance computing, distributed systems, or cloud architecture.
Proven expertise in AWS HPC services (EC2 HPC instances, ParallelCluster, Batch, FSx for Lustre, EFA).
Strong knowledge of Linux systems administration, networking (InfiniBand, EFA, MPI), and job schedulers (Slurm, Torque, PBS Pro).
Hands-on experience with automation and IaC (Terraform, Ansible, CloudFormation).
Scripting and development proficiency (Python, Bash, or similar).
Experience with monitoring tools (CloudWatch, Grafana, Prometheus) and cost-optimization strategies.
AWS Certified Solutions Architect - Professional or AWS Certified Advanced Networking - Specialty preferred.
Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
Preferred Attributes:
Experience with GPU workloads, containerized HPC (ECS/EKS with ParallelCluster), or hybrid/on-prem-to-cloud HPC migrations.
Strong communication and presentation skills for executive and technical audiences.
Demonstrated thought leadership in HPC strategy, performance benchmarking, and AWS innovation.
Required Skills:
HPC on AWS Specialist: For this project we will be leveraging SUSE Linux, Amazon PCS, Slurm, and FSx for NetApp ONTAP. We need a resource who can help architect and create the HPC environment for an EDA on AWS POC (up to 20,000 cores). 12 weeks, full time; in a perfect world we will start on 12/1.