Job Title: AI Systems Architect
Location: Dallas, Charlotte, San Francisco Bay Area
Type of Hire: C2C
Job Summary
We are seeking an experienced AI Systems Architect to design, build, and scale high-performance distributed AI systems. The ideal candidate will have deep expertise in GenAI, LLMs, and cloud-native architectures, along with hands-on experience building enterprise-scale AI/ML platforms and agent-based systems.
Must-Have Skills
Strong experience in designing and implementing high-performance, large-scale distributed systems
Proven experience in implementing and deploying AI/ML platforms at scale
Expertise in building agent-based architectures and evaluation frameworks, and in prompt/context engineering
Knowledge of MCP (Model Context Protocol) servers
Hands-on experience in LLM inference optimization including batching and caching strategies
Strong experience with Kubernetes and cloud infrastructure (AWS/Azure/GCP)
Proficiency in at least one programming language (Python, Java, Go, etc.)
Expertise in designing agent data stacks and retrieval systems, including:
Vector databases
Hybrid search
Data freshness strategies
Memory systems
Graph reasoning
BM25 and advanced retrieval techniques
Key Responsibilities
Architect and deliver scalable, high-performance distributed systems
Design and deploy AI/ML and GenAI platforms at enterprise scale
Build and manage agent-based architectures, including:
Prompt and context engineering
MCP servers
Evaluation frameworks
Optimize LLM inference pipelines for latency, throughput, and efficiency
Design and implement agent data and retrieval systems (vector DBs, hybrid search, memory, graph-based reasoning)
Lead Kubernetes-based cloud-native deployments
Provide technical leadership, architecture governance, and hands-on mentoring to engineering teams
Nice to Have
Experience with RAG (Retrieval-Augmented Generation) frameworks
Familiarity with multi-agent systems and orchestration frameworks
Exposure to real-time data pipelines and streaming architectures