Job Title: Application Management Specialist
Location: Dallas, TX or New York City, NY (Hybrid)
Type: Contract
Job Duties:
- Build agentic AI systems: Design and implement tool-calling agents that combine retrieval, structured reasoning, and secure action execution (function calling, change orchestration, policy enforcement) following the MCP protocol. Engineer robust guardrails for safety, compliance, and least-privilege access.
- Productionize LLMs: Build an evaluation framework for open-source and foundational LLMs; implement retrieval pipelines, prompt synthesis, response validation, and self-correction loops tailored to production operations.
- Integrate with runtime ecosystems: Connect agents to observability, incident management, and deployment systems to enable automated diagnostics, runbook execution, remediation, and post-incident summarization with full traceability.
- Collaborate directly with users: Partner with production engineers and application teams to translate production pain points into agentic AI roadmaps; define objective functions linked to reliability, risk reduction, and cost; and deliver auditable, business-aligned outcomes.
- Safety, reliability, and governance: Build validator models, adversarial prompts, and policy checks into the stack; enforce deterministic fallbacks, circuit breakers, and rollback strategies; instrument continuous evaluations for usefulness, correctness, and risk.
- Scale and performance: Optimize cost and latency via prompt engineering, context management, caching, model routing, and distillation; leverage batching, streaming, and parallel tool calls to meet stringent SLOs under real-world load.
- Build a RAG pipeline: Curate domain knowledge; build a data-quality validation framework; establish feedback loops and a milestone framework to maintain knowledge freshness.
- Raise the bar: Drive design reviews, experiment rigor, and high-quality engineering practices; mentor peers on agent architectures, evaluation methodologies, and safe deployment patterns.
Role Requirements:
ESSENTIAL SKILLS
- 5 years of software development in one or more languages (Python, C/C++, Go, Java); strong hands-on experience building and maintaining large-scale Python applications preferred.
- 3 years designing, architecting, testing, and launching production ML systems, including model deployment/serving, evaluation and monitoring, data processing pipelines, and model fine-tuning workflows.
- Practical experience with Large Language Models (LLMs): API integration, prompt engineering, fine-tuning/adaptation, and building applications using RAG and tool-using agents (vector retrieval, function calling, secure tool execution).
- Understanding of different LLMs, both commercial and open source, and their capabilities (e.g., OpenAI, Gemini, Llama, Qwen, Claude).
- Solid grasp of applied statistics, core ML concepts, algorithms, and data structures to deliver efficient and reliable solutions.
- Strong analytical problem-solving, ownership, and urgency; ability to communicate complex ideas simply and collaborate effectively across global teams with a focus on measurable business impact.
Preferred:
- Proficiency building and operating on cloud infrastructure (ideally AWS), including containerized services (ECS/EKS), serverless (Lambda), data services (S3, DynamoDB, Redshift), orchestration (Step Functions), model serving (SageMaker), and infra-as-code (Terraform/CloudFormation).