AI Engineer
Job Location: Scottsdale, AZ - USA
Monthly Salary: Not Disclosed
Posted on: 4 days ago
Vacancies: 1 Vacancy
Job Summary
We are seeking an experienced AI/ML Engineer to design, build, and operate AI/ML infrastructure and agentic systems. This role involves developing MCP servers and agents, integrating LLMs, and implementing RAG pipelines for production environments.
Key Responsibilities
- Design, build, and operate MCP servers and MCP agents that host, orchestrate, and monitor AI/agent workloads.
- Develop agentic AI, prompt engineering patterns, LLM integrations, and developer tooling for production use.
- Own deployment, scaling, reliability, and cost-efficiency on Kubernetes/Docker and Google Cloud with automated CI/CD.
- Design and implement RAG (Retrieval-Augmented Generation) pipelines and integrations with vector stores and retrieval tooling; use LangChain and Langfuse for orchestration, chaining, and observability.
Core Responsibilities
- Implement and maintain MCP server and agent code, APIs, and SDKs for model access and agent orchestration.
- Design agent behavior, workflows, and safety guards for agentic AI systems.
- Create, test, and iterate on prompt templates, evaluation harnesses, and grounding/chain-of-thought strategies.
- Integrate LLMs and model providers (self-hosted and cloud APIs) with unified adapters and telemetry.
- Build developer tooling: CLI, local runner, simulators, and debugging tools for agents and prompts.
- Containerize services (Docker), manage orchestration (Kubernetes/GKE), and optimize nodes, autoscaling, and resource requests.
- Ensure observability: logging, metrics, traces, dashboards, alerting, and SLOs for model infrastructure and agents.
- Create runbooks, playbooks, and incident response procedures; reduce MTTR and perform postmortems.
- Design and maintain RAG workflows: document chunking, embeddings, vector indexing, retrieval strategies, re-ranking, and context injection.
- Integrate and instrument LangChain for composable chains, agents, and tooling; use Langfuse (or equivalent tracing) to capture prompts, model calls, RAG traces, and evaluation telemetry.
Required Skills & Experience
- 5 years of strong software engineering (Python/Node.js), system design, and production service experience.
- 2 years of experience with LLMs, prompt engineering, and agent frameworks.
- 2 years of practical experience implementing RAG: embeddings, vector DBs, and retrieval tuning.
- 2 years of experience with LangChain patterns and with toolchain telemetry (Langfuse or similar) for prompt/model traceability.
- 5 years of experience with Kubernetes, Docker, CI/CD, and infrastructure as code.
- 2 years of practical experience with Google Cloud Platform services.
- 2 years of experience with observability, testing, and security best practices for distributed systems.
- 2 years of experience evaluating and mitigating retrieval/augmentation failures, hallucinations, and leakage risks in RAG systems.
- Familiarity with vendor and open-source vector stores and embedding providers.
- Familiarity with CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, or ArgoCD).