Staff Machine Learning Engineer, AI Agent Platform
New York City, NY - USA
Job Summary
Why Join GEICO
At GEICO, we offer a rewarding career where your ambitions are met with endless possibilities.
Every day we honor our iconic brand by offering quality coverage to millions of customers and being there when they need us most. We thrive on relentless innovation to exceed our customers' expectations while making a real impact on local communities nationwide.
Founded in 1936, GEICO is a member of the Berkshire Hathaway family of companies and one of the largest auto insurers in the United States. When you join our company, we want you to feel valued, supported, and proud to work here. That's why we offer the GEICO Pledge: Great Company, Great Culture, Great Rewards, and Great Careers.
Staff Machine Learning Engineer, AI Agent Platform
The GEICO AI Agent Platform team is seeking an exceptional Staff ML Engineer to build the next-generation enterprise AI Agent OS and SDKs. You will design, implement, and maintain scalable backend systems that enable business, product, and engineering teams to build, test, and deploy their own AI agents. In 2026, the agentic AI landscape is maturing rapidly, with standardized protocols (MCP, A2A), AI agent skill ecosystems, harness engineering, context engineering, and governance-first design becoming table stakes. You will help GEICO stay at the forefront. The candidate must have excellent communication skills and a proven track record of delivering business value via technical excellence.
Key Responsibilities
Platform Engineering
- Architect scalable multi-tenant backend systems for AI agent workflows, including AI agent configuration, evaluation, synthetic data generation, workflow simulation & evaluation, MCP server registry, A2A communication infrastructure, and guardrail enforcement layers, using AKS, FastAPI, etc.
- Build an enterprise AI agent skill ecosystem: a platform for authoring, publishing, discovering, versioning, and governing reusable skill packages that encode domain expertise into portable modules. Implement an internal skill marketplace with search/discovery, quality scoring, security vetting pipelines, approval workflows, and progressive disclosure loading.
- Implement production-grade AI agent harnesses: the non-model infrastructure (tool dispatch, context management, error recovery/self-healing, session state, sub-agent coordination) that makes AI agents reliable for long-running tasks. Design feedforward guides (linters, type checkers, architecture constraints) and feedback sensors (test execution, LLM-as-judge, semantic analysis), mixing computational and inferential controls.
- Build and optimize context engineering systems: memory hierarchies (short-term, working, long-term), RAG pipelines, scratchpads, context compaction/summarization, and dynamic skill/tool loading, ensuring AI agents receive the right information at the right time while minimizing token waste.
- Develop observability frameworks (OpenTelemetry, distributed tracing) with LLM-specific telemetry: token usage, latency profiling, hallucination detection, AI agent behavior auditing, and skill execution monitoring.
AI Safety, Governance & Guardrails
- Design layered guardrail architectures (input validation, prompt injection defense, PII detection, output verification) with parallelized enforcement for minimal latency impact.
- Implement skill-level governance: security vetting for hidden payloads, credential theft, and data exfiltration risks; authoring standards; conflict resolution; version management; and deprecation workflows.
Technical Leadership
- Act as tech lead for a sub-team, setting direction and ensuring consistency in design principles. Provide hands-on mentorship during design reviews, code assessments, and performance tuning.
- Establish engineering standards for ML infrastructure, harness engineering patterns, skill authoring, and deployment practices. Create documentation, runbooks, and training on platform capabilities.
- Collaborate cross-functionally with data scientists, engineers, and product teams. Translate complex technical concepts for diverse stakeholders.
Qualifications
Technical Skills
- Bachelor's in CS, Engineering, or related field; advanced degree highly desirable.
- 6 years designing, implementing, and maintaining multi-tenant AI/ML systems in production.
- 6 years with cloud platforms (Azure, AWS) and backend systems (Kubernetes, Temporal, OpenSearch, PostgreSQL, Redis, Neo4j). Deep understanding of Docker, Prometheus, and OpenTelemetry.
- Deep proficiency in Python, Java, or Go. Extra credit for effectively leveraging AI coding tools (Cursor, Claude Code, GitHub Copilot).
- Proficiency in AI/ML and agentic frameworks (TensorFlow, PyTorch, LangGraph, CrewAI, AutoGen).
Leadership Skills
- Demonstrated track record mentoring engineers and leading technical initiatives.
- Excellent communication across diverse seniority levels and professional backgrounds.
Preferred Specialized Skills
- Experience with harness engineering concepts and practices such as tool dispatch, error recovery, session state, permissions, sub-agent coordination, and planning & reasoning with feedback loops.
- Experience designing AI agent skill systems: reusable capability packages and skill registries/marketplaces with discovery, versioning, security vetting, and governance controls.
- Hands-on experience with MCP (server development, registries) and A2A (AI agent card discovery, task delegation).
- Experience with LLM observability (LangSmith, Langfuse, Arize Phoenix) and guardrail systems (prompt injection defense, PII scanning, skill-level security auditing).
- Experience with multi-agent orchestration, both open-source (Llama, Qwen, Mistral) and proprietary (GPT, Claude) LLMs, and no-code/low-code AI agent development environments.
If you are passionate about pushing the boundaries of generative AI platforms, thrive in a hands-on technical leadership role, and enjoy solving complex, large-scale problems, we encourage you to apply.
Annual Salary
$115,000.00 - $260,000.00
The above annual salary range is a general guideline. Multiple factors are taken into consideration to arrive at the final hourly rate/annual salary to be offered to the selected candidate. Factors include, but are not limited to, the scope and responsibilities of the role; the selected candidate's work experience, education, and training; the work location; and market and business considerations.
The GEICO Pledge:
Great Company: Protecting customers through life's twists and turns with innovation and integrity.
Great Careers: Personalized development programs, mentorship, and certification assistance.
Great Culture: Inclusive and collaborative culture rooted in shared success.
Great Rewards: Competitive pay, benefits, and flexibility to support your well-being and future.
The equal employment opportunity policy of the GEICO Companies provides for a fair and equal employment opportunity for all associates and job applicants regardless of race, color, religious creed, national origin, ancestry, age, gender, pregnancy, sexual orientation, gender identity, marital status, familial status, disability, or genetic information, in compliance with applicable federal, state, and local law. GEICO hires and promotes individuals solely on the basis of their qualifications for the job to be filled.
GEICO reasonably accommodates qualified individuals with disabilities to enable them to receive equal employment opportunity and/or perform the essential functions of the job, unless the accommodation would impose an undue hardship to the Company. This applies to all applicants and associates. GEICO also provides a work environment in which each associate is able to be productive and work to the best of their ability. We do not condone or tolerate an atmosphere of intimidation or harassment. We expect and require the cooperation of all associates in maintaining an atmosphere free from discrimination and harassment, with mutual respect by and for all associates and applicants.
Required Experience:
Staff IC
About Company
Get insurance from a company that's been trusted since 1936. See how much you can save with GEICO on insurance for your car, motorcycle, and more.