AI Application Engineer
Job Location:
Santa Clara County, CA - USA
Monthly Salary:
Not Disclosed
Posted on:
22 days ago
Vacancies:
1 Vacancy
Job Summary
Role: AI Application Engineer
Location: Santa Clara, CA (Hybrid)
Type: Contract
Overview:
We are seeking an AI Application Engineer to support the development and delivery of next-generation AI-powered applications built on NVIDIA infrastructure. This role will focus on production-grade LLM application engineering, RAG quality, prompt engineering, AI safety, and orchestration of complex multi-step AI pipelines.
Day-to-Day Responsibilities
- Design, develop, and optimize production-grade LLM-powered applications
- Own AI quality, RAG accuracy, prompt engineering, and AI safety across multiple applications
- Develop and maintain multi-step LLM orchestration pipelines using LangChain, LlamaIndex, or custom frameworks
- Implement and optimize RAG pipelines, including chunking strategies, embedding selection, reranking, and hybrid search
- Design multi-turn conversational AI experiences with context management and session memory
- Integrate NVIDIA technologies, including NIM, NeMo, NeMo Guardrails, and Riva, into enterprise AI applications
- Build automated evaluation pipelines for model quality, hallucination detection, regression testing, and release gating
- Perform latency profiling and optimization across multi-step LLM call chains
- Implement AI safety guardrails, including prompt injection prevention, jailbreak mitigation, and topical control
- Collaborate with globally distributed engineering and product teams to deliver scalable AI solutions
- Support deployment, monitoring, and continuous improvement of AI applications in production environments
Basic Qualifications:
- 4-7 years of software engineering experience, with at least 2 years focused on production LLM application development
- Expert-level experience with Python for AI/ML application development and async programming
- Strong expertise in prompt engineering, including system prompts, few-shot prompting, and instruction tuning
- 3 years of hands-on experience with multi-step LLM orchestration frameworks such as LangChain or LlamaIndex
- 3 years of experience designing and optimizing RAG pipelines and retrieval systems
- 3 years of experience with vector databases, similarity search tuning, and reranking techniques
- 3 years of hands-on experience with NVIDIA NIM, NeMo, NeMo Guardrails, and Riva
- 3 years of experience implementing AI safety and guardrails for customer-facing applications
- Strong knowledge of automated AI evaluation frameworks such as RAGAS or TruLens
- 3 years of experience profiling and optimizing latency in multi-step AI pipelines
- Ability to work onsite in Santa Clara, CA
Preferred Qualifications:
- Experience with adaptive learning systems or recommendation engines
- Knowledge graph integration experience with RAG architectures
- Experience with multi-agent orchestration patterns
- ServiceNow API integration experience
- Prior experience building AI products on NVIDIA infrastructure
- Experience with streaming LLM response handling and real-time AI applications
Technology Stack
- Python
- LangChain
- LlamaIndex
- NVIDIA NIM
- NeMo
- NeMo Guardrails
- NVIDIA Riva
- Vector Databases
- RAGAS / TruLens
- LLM APIs and orchestration frameworks
Education
- Bachelor's degree in Computer Science, Engineering, Artificial Intelligence, or equivalent work experience.