Cloud Infrastructure Engineer

Key2Source

Job Location:

Charlotte, NC - USA

Monthly Salary: Not Disclosed
Posted on: 1 hour ago
Vacancies: 1 Vacancy

Job Summary

Job Title: Cloud Infrastructure Engineer

Location: Charlotte, NC (5 days onsite)

Duration: 12 months

Primary Skills

  • vLLM
  • TensorRT-LLM
  • Triton Inference Server
  • SGLang
  • Kubernetes ML Serving
  • KServe
  • OpenShift AI
  • GPU Orchestration
  • GCP
  • Terraform

Key Responsibilities

  • Design and manage scalable AI/ML infrastructure for GenAI and LLM workloads.
  • Deploy and optimize LLM inference pipelines using vLLM, TensorRT-LLM, Triton Inference Server, and SGLang.
  • Implement inference optimization techniques including:
  1. Continuous Batching
  2. Speculative Decoding
  3. KV Cache / Prefix Caching
  4. FP8 / AWQ / GPTQ quantization
  5. Tensor Parallelism
  • Build and maintain Kubernetes-based ML serving platforms using KServe and OpenShift AI.
  • Manage GPU orchestration and scheduling using technologies such as Run:AI, CUDA, NCCL, and MIG.
  • Develop Helm charts, Kubernetes Operators, and platform automation for AI workloads.
  • Conduct performance benchmarking and optimization for GPU-based inference systems.
  • Implement monitoring and observability using Prometheus and Grafana.
  • Collaborate with data science and ML engineering teams to productionize LLMs.
  • Automate infrastructure provisioning and deployment using Terraform.

Required Qualifications

  • 6 years of experience in cloud engineering or platform engineering.
  • Experience with LLMOps/MLOps platforms.
  • Strong hands-on experience with Kubernetes and containerized AI/ML workloads.
  • Experience with GPU infrastructure and distributed inference optimization.
  • Proficiency in GCP cloud services and cloud-native architecture.
  • Strong scripting/programming skills in Python.
  • Experience with ML observability and production monitoring tools.
  • Familiarity with OpenShift AI and enterprise Kubernetes ecosystems.

Preferred Qualifications

  • Knowledge of GenAI frameworks and RAG architectures.
  • Exposure to enterprise AI governance and security practices.