Kubernetes Engineer

1 Vacancy

Job Location

Dallas - USA

Monthly Salary

Not Disclosed

Job Description

Do you want to tackle the biggest questions in finance with near-infinite compute power at your fingertips?

G-Research is a leading quantitative research and technology firm with offices in London and Dallas.

We are proud to employ some of the best people in their field and to nurture their talent in a dynamic, flexible and highly stimulating culture where world-beating ideas are cultivated and rewarded.

This is a hybrid role based in our new Dallas infrastructure hub where we work on the latest technologies in a cutting-edge environment.

The role

We are seeking a highly skilled Senior Kubernetes Engineer to join our Platform Engineering function in Dallas.

In this role you will design, implement and optimise GPU-accelerated container platforms at scale, enabling high-performance workloads (AI/ML, HPC, LLM training) across hybrid or on-prem environments.

You will have deep expertise with both NVIDIA and Kubernetes ecosystems, including GPU scheduling, device plugins and custom operators.

Key responsibilities of the role include:

  • Architecting and operating Kubernetes clusters optimised for GPU workloads, leveraging NVIDIA GPU Operator, Network Operator and DCGM

  • Developing, deploying and maintaining custom Kubernetes operators and controllers to automate infrastructure services

  • Integrating NVIDIA device plugins, Multi-Instance GPU (MIG) and GPU sharing features into the scheduling layer

  • Optimising GPU utilisation and job placement through scheduler extensions such as kube-scheduler plugins, Slurm and Volcano

  • Collaborating with HPC, ML and DevOps teams to ensure multi-tenant, high-throughput cluster performance

  • Driving observability and telemetry integrations using Prometheus, Grafana, DCGM Exporter and OpenTelemetry

  • Implementing secure multi-user and multi-namespace GPU isolation with RBAC and policy enforcement such as OPA or Gatekeeper

  • Maintaining CI/CD pipelines for Kubernetes infrastructure using GitOps, ArgoCD and FluxCD

  • Contributing to infrastructure-as-code using Terraform, Helm and Kustomize

  • Participating in performance tuning, incident response and production readiness reviews

Who are we looking for?

The ideal candidate will have the following skills and experience:

  • Extensive experience with Kubernetes in production-grade environments and with the NVIDIA Kubernetes tooling, including GPU Operator, device plugin, NVML, MIG and DCGM

  • Proficiency in Go or Python for operator development and Kubernetes controller logic

  • Deep understanding of Kubernetes internals, including CRDs, RBAC, custom controllers and scheduler extensions

  • Experience with GPU-intensive workloads such as LLMs, training pipelines and scientific computing

  • Hands-on experience with Helm, Kustomize and GitOps workflows

  • Familiarity with CNI plugins, especially NVIDIA CNI and Multus

  • Experience with monitoring GPU metrics and cluster health using Prometheus and DCGM Exporter

The following is beneficial:

  • Knowledge of container runtimes, including CRI-O, containerd and NVIDIA Container Toolkit

  • Contributions to open-source projects in the Kubernetes or NVIDIA ecosystem

  • Experience working with Cilium or other CNI plugins

Why should you apply?

  • Market-leading compensation plus annual discretionary bonus

  • Lunch provided in the office (via GrubHub)

  • Informal dress code and excellent work/life balance

  • Excellent paid time off allowance of 25 days

  • Sick days, military leave, and family and medical leave

  • Generous 401(k) plan

  • 16 weeks of fully paid parental leave

  • Medical and Prescription, Dental, and Vision insurance

  • Life and Accidental Death & Dismemberment (AD&D) insurance

  • Employee Assistance and Wellness programs

  • Generous relocation allowance and support

  • Great selection of office snacks and hot and cold drinks

  • On-site gym and car parking

This role is employed through our US affiliate.

G-Research is committed to cultivating and preserving an inclusive work environment. We are an ideas-driven business and we place great value on diversity of experience and opinions.

We want to ensure that applicants receive a recruitment experience that enables them to perform at their best. If you have a disability or special need that requires accommodation, please let us know in the relevant section.

Employment Type

Full-Time
