Senior AWS AgentCore Platform Engineer

Mprogen


Job Location:

Reading, PA - USA

Monthly Salary: Not Disclosed
Posted on: 4 hours ago
Vacancies: 1 Vacancy

Job Summary

Role: Senior AWS AgentCore Platform Engineer

Location: Reading, PA (Hybrid, 2-3 days a week in the office)

Position Type: Contract-to-Hire (CTH)

We are looking for a highly technical Lead Platform Engineer to architect the observability, cost governance, and security framework for our enterprise AI agent ecosystem. You will be responsible for ensuring our agentic workflows, built on AWS Bedrock AgentCore and MCP servers, are scalable, observable, and cost-efficient.

The ideal candidate bridges the gap between traditional DevOps and the emerging world of LLMOps, with a deep focus on distributed tracing for non-deterministic AI workloads.

Requirements

Experience: 8 years in Platform Engineering, DevOps, or Site Reliability Engineering (SRE).

Cloud Expertise: Deep proficiency in AWS (IAM, CloudWatch, Lambda).

Observability Tools: Proven experience with Dynatrace, Jaeger, or Honeycomb, and with distributed tracing standards.

AI/LLM Interest: Familiarity with the LLM lifecycle, including prompt execution, token usage, and frameworks like LangChain or AgentCore.

Automation: Advanced experience with Terraform and CI/CD pipeline design.

Collaboration: Experience working in an Agile environment with integrated tools like Microsoft Teams and Confluence.

Job Responsibilities

Observability

  • Assess CloudWatch, X-Ray, Bedrock logging, and AgentCore traces against agentic workflow requirements; produce a gap analysis
  • Set up observability in Dynatrace
  • Design a post-deployment validation pipeline for agents & MCP servers (deployment health, tool registration checks)
  • Implement distributed tracing & structured logging for LLM decisions, tool selections, sub-agent calls, and MCP interactions
  • Evaluate LangFuse / LiteLLM proxy vs. AWS-native options; deliver a target-state observability architecture recommendation

Cost Tracking & TCO

  • Extend the tagging taxonomy to cover agent runtimes, MCP servers, vector DBs, and Bedrock token consumption per namespace
  • Design a cost visibility model: aggregate agent, MCP, vector DB, and Bedrock token costs per team/department
  • Build CloudWatch (or equivalent) dashboards for per-team spend; configure AWS Budgets with alerting thresholds
  • Automate cost reports delivered via email / Microsoft Teams; implement anomaly detection rules

Monitoring & Alerting

  • Define P1-P4 alerting rules for deployment failures, runtime errors, tool invocation failures, and MCP connectivity issues
  • Integrate alert notifications with Microsoft Teams channels and email; route by resource ownership tags
  • Author runbooks linked to every alert; publish them in Confluence for developer self-service resolution
  • Evaluate AWS-native vs. third-party monitoring stacks; deliver a recommendation aligned with the observability architecture

Security & Access Control

  • Assess the current IAM tagging approach for multi-team isolation; identify scalability gaps and risks
  • Evaluate the Cedar policy engine (AgentCore) for fine-grained tool access control; document enterprise-scale gaps
  • Design a scalable ABAC-based identity model for multi-team isolation without IAM policy sprawl; deliver Terraform modules


Required Skills:

AWS
