AI/ML Engineer

Job Location: Burlingame, CA - USA

Monthly Salary: Not Disclosed
Posted on: 7 hours ago
Vacancies: 1 Vacancy

Job Summary

K&K Global Talent Solutions Inc. is an international recruiting agency that has been providing technical resources in Canada and the USA since 1993.

This position is with one of our clients in the USA who is actively hiring candidates to expand their teams.


Role: AI/ML Engineer

Location: Burlingame CA (Hybrid)

About the Role

We are seeking an experienced AI/ML Engineer to build, scale, and maintain the critical infrastructure that powers our AI models and autonomous agents. In this role, you will act as the bridge between our AI research/development teams and our production environments. You will not just be deploying models; you will be designing the high-performance distributed systems required to serve Large Language Models (LLMs), orchestrate multi-agent workflows, and optimize GPU compute at scale.

If you are passionate about turning complex AI capabilities into highly reliable, scalable, and cost-efficient production systems, this is the role for you.

Key Responsibilities

1. Machine Learning Infrastructure & Serving

  • Design, build, and manage scalable infrastructure for training, fine-tuning, and serving LLMs and multimodal models.
  • Optimize inference latency, throughput, and cost using modern serving frameworks (e.g., vLLM, Triton Inference Server, Ray Serve).
  • Manage and orchestrate GPU/TPU clusters, ensuring high utilization and efficient resource allocation.

2. Building and Scaling Agentic Operations (AgentOps)

  • Architect and deploy infrastructure to support autonomous AI agents and multi-agent systems.
  • Integrate and maintain agent orchestration frameworks (e.g., LangGraph, CrewAI) within production environments.
  • Build robust state management and memory systems (vector databases, graph databases) required for agentic workflows.

3. Observability, Evaluation, and Reliability

  • Implement comprehensive observability stacks tailored for LLMs and agents (tracing, prompt logging, cost tracking) using tools like Langfuse, Arize, or Datadog.
  • Design automated evaluation pipelines to monitor agent performance, safety, and reliability in real time (LLMOps/AgentOps).
  • Act as the first line of defense for production AI systems, diagnosing and resolving issues related to memory limits, inference queues, and cluster failures.

4. Developer Platform & CI/CD for AI

  • Build internal developer platforms and tooling that allow AI engineers and data scientists to easily deploy models and agents to production.
  • Adapt traditional CI/CD pipelines to accommodate model versioning, prompt management, and continuous evaluation.

Qualifications

Required Skills:

  • Systems Engineering: Strong background in distributed systems, backend engineering, or DevOps/SRE.
  • Programming: Proficiency in Python (essential for the AI ecosystem) and systems languages like Go or Rust.
  • Containerization & Orchestration: Deep expertise in Kubernetes (K8s), Docker, and infrastructure-as-code (Terraform, Pulumi).
  • AI/ML Tooling: Hands-on experience with LLM serving engines (vLLM, TGI, Triton) and distributed computing frameworks (Ray).
  • Agent Frameworks: Familiarity with modern agentic development frameworks like LangChain, LangGraph, or CrewAI.
  • Cloud & Hardware: Experience managing high-performance compute (GPUs/TPUs) on major cloud providers (AWS, GCP, Azure).

Preferred Skills:

  • Experience with vector databases (Pinecone, Milvus, Qdrant) and retrieval-augmented generation (RAG) pipelines.
  • Understanding of model optimization techniques (quantization, LoRA, KV caching).
  • Previous experience building platforms from the ground up in a high-growth environment.