USD 310,000 - 460,000
1 Vacancy
About the Team
OpenAI's Inference team ensures that our most advanced models run efficiently, reliably, and at scale. We build and optimize the systems that power our production APIs, internal research tools, and experimental model deployments. As model architectures and hardware evolve, we're expanding support for a broader set of compute platforms (for example, AMD GPUs) to increase performance, flexibility, and resiliency across our infrastructure.
We are forming a team to generalize our inference stack, including kernels, communication libraries, and serving infrastructure, to alternative hardware architectures such as AMD GPUs.
About the Role
We're hiring engineers to scale and optimize OpenAI's inference infrastructure across emerging GPU platforms. You'll work across the stack, from low-level kernel performance to high-level distributed execution, and collaborate closely with research, infrastructure, and performance teams to ensure our largest models run smoothly on new hardware.
This is a high-impact opportunity to shape OpenAI's multi-platform inference capabilities from the ground up.
In this role, you will:
Design and optimize high-performance GPU kernels for AMD accelerators using HIP, Triton, or other performance-focused frameworks (see the first sketch after this list).
Build and tune collective communication libraries (e.g., RCCL) used to parallelize model execution across many GPUs (see the second sketch after this list).
Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into AMD-backed systems.
Debug and optimize distributed inference workloads across memory, network, and compute layers.
Validate correctness, performance, and scalability of model execution on large AMD GPU clusters.
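For a flavor of the kernel-level work described above, here is a minimal Triton vector-add kernel; Triton's ROCm backend compiles the same Python source for AMD GPUs. This is an illustrative sketch, not OpenAI code, and the names (`add_kernel`, `add`) are ours.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide tile of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final tile
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

Real inference kernels (attention, GEMM epilogues) layer tiling, vectorized loads, and per-architecture tuning on top of this basic pattern.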
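The collectives work centers on primitives such as all-reduce, which combines partial results (for example, tensor-parallel matmul shards) across GPUs. Below is a minimal sketch using PyTorch's distributed API; on ROCm builds of PyTorch, the "nccl" backend is backed by RCCL. The script layout and launch command are illustrative assumptions, not OpenAI infrastructure.

```python
import torch
import torch.distributed as dist

def main():
    # On ROCm builds of PyTorch, the "nccl" backend maps to RCCL.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # Each rank holds a partial result; all-reduce sums them in place,
    # the same collective used to combine tensor-parallel outputs.
    t = torch.full((4,), float(rank), device="cuda")
    dist.all_reduce(t, op=dist.ReduceOp.SUM)
    print(f"rank {rank}: {t.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, e.g., `torchrun --nproc_per_node=8 allreduce_demo.py` (the filename is hypothetical); each rank prints the same summed tensor.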
You can thrive in this role if you:
Have experience writing or porting GPU kernels using HIP, CUDA, or Triton, and care deeply about low-level performance.
Are familiar with communication libraries like NCCL/RCCL and understand their role in high-throughput model serving.
Have worked on distributed inference systems and are comfortable scaling models across fleets of accelerators.
Enjoy solving end-to-end performance challenges across hardware, system libraries, and orchestration layers.
Are excited to be part of a small, fast-moving team building new infrastructure from first principles.
Nice to Have:
Contributions to open-source libraries like RCCL, Triton, or vLLM.
Experience with GPU performance tools (Nsight, rocprof, perf) and memory/communication profiling.
Prior experience deploying inference on AMD or other non-NVIDIA GPU environments.
Knowledge of model/tensor parallelism, mixed precision, and serving 10B-parameter models.
About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability, or any other legally protected status.
OpenAI Affirmative Action and Equal Employment Opportunity Policy Statement
For US-Based Candidates: Pursuant to the San Francisco Fair Chance Ordinance, we will consider qualified applicants with arrest and conviction records.
We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.
OpenAI Global Applicant Privacy Policy
At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
Full-Time