AI Engineer

Connvertex Technologies Inc.

Job Location: Scottsdale, AZ, USA

Monthly Salary: Not Disclosed

Posted on: 4 days ago

Vacancies: 1

Job Summary

We are seeking an experienced AI/ML Engineer to design, build, and operate AI/ML infrastructure and agentic systems. This role involves developing MCP servers and agents, integrating LLMs, and implementing RAG pipelines for production environments.

Key Responsibilities

Design, build, and operate MCP servers and MCP agents that host, orchestrate, and monitor AI/agent workloads.
Develop agentic AI, prompt engineering patterns, LLM integrations, and developer tooling for production use.
Own deployment, scaling, reliability, and cost-efficiency on Kubernetes/Docker and Google Cloud with automated CI/CD.
Design and implement RAG (Retrieval-Augmented Generation) pipelines and integrations with vector stores and retrieval tooling; use LangChain and Langfuse for orchestration, chaining, and observability.

Core Responsibilities

Implement and maintain MCP server and agent code, APIs, and SDKs for model access and agent orchestration.
Design agent behavior, workflows, and safety guards for agentic AI systems.
Create, test, and iterate on prompt templates, evaluation harnesses, and grounding/chain-of-thought strategies.
Integrate LLMs and model providers (self-hosted and cloud APIs) with unified adapters and telemetry.
Build developer tooling: CLI, local runner, simulators, and debugging tools for agents and prompts.
Containerize services (Docker), manage orchestration (Kubernetes/GKE), and optimize nodes, autoscaling, and resource requests.
Ensure observability: logging, metrics, traces, dashboards, alerting, and SLOs for model infra and agents.
Create runbooks, playbooks, and incident response procedures; reduce MTTR and perform postmortems.
Design and maintain RAG workflows: document chunking, embeddings, vector indexing, retrieval strategies, re-ranking, and context injection.
Integrate and instrument LangChain for composable chains, agents, and tooling; use Langfuse (or equivalent tracing) to capture prompts, model calls, RAG traces, and evaluation telemetry.

Required Skills & Experience

5 years of strong software engineering (Python/Node.js), system design, and production service experience.
2 years of experience with LLMs, prompt engineering, and agent frameworks.
2 years of practical experience implementing RAG: embeddings, vector DBs, and retrieval tuning.
2 years of experience with LangChain patterns and with toolchain telemetry (Langfuse or similar) for prompt/model traceability.
5 years of experience with Kubernetes, Docker, CI/CD, and infrastructure as code.
2 years of practical experience with Google Cloud Platform services.
2 years of experience with observability, testing, and security best practices for distributed systems.
2 years of experience evaluating and mitigating retrieval/augmentation failures, hallucinations, and leakage risks in RAG systems.
Familiarity with vendor and open-source vector stores and embedding providers.
Familiarity with CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, or ArgoCD).
