Principal Engineer, AI Platform
Job Summary
WHAT MAKES US EPIC
At the core of Epic's success are talented, passionate people. Epic prides itself on creating a collaborative, welcoming, and creative environment. Whether it's building award-winning games or crafting engine technology that enables others to make visually stunning interactive experiences, we're always innovating.
Being Epic means being a part of a team that continually strives to do right by our community and users. We're constantly innovating to raise the bar of engine and game development.
ONLINE INFRASTRUCTURE
What We Do
We enable Epic's online services teams to build, deploy, and manage services that are used by more than half a billion players around the world. Our mission is to provide world-class tools and platforms to improve the experience of our developers and make it easier, faster, and safer to build, operate, and scale their applications. We operate at massive scale as one of the largest cloud computing users in the world.
What You'll Do
Epic Games is the creator of Fortnite, Unreal Engine, and the Epic Games Store: a company that has shaped the way billions of people play, create, and connect. We build the foundational technology that powers some of the largest interactive experiences on the planet, and we give those tools to creators and developers around the world through open ecosystems.
Behind the products is an engineering organization that operates at rare scale: real-time infrastructure for hundreds of millions of players, a game engine used across film, architecture, and automotive industries, and a platform ecosystem that lets indie developers ship games to every major device on Earth.
Our AI Platform team is building the next layer of that infrastructure: an enterprise-grade stack of agentic AI systems that automates engineering workflows, accelerates developer productivity, and enables new kinds of collaboration across Epic's teams. We're not a research group, and we're not deploying off-the-shelf tools. We're architecting and building production systems from the ground up across six interconnected platforms:
- AI Agent Orchestration: multi-tenant platform for team AI agents that live and collaborate in Slack channels, Source Control, etc.
- EMA (Epic Managed Agents): compute and workspace infrastructure for headless agent harness runs at scale
- AI MCP Gateway: MCP OAuth gateway, plugin runtime, and governance layer for AI tool orchestration
- Non Human Identity Management: agent identity, credential vault, and authorization for non-human workloads
- Centralized AI Knowledge Base: org-wide memory plane with knowledge graph, deductive reasoning, and hierarchical summarization
- Roost: cryptographically signed software distribution and the Claude Code plugin marketplace
This is foundational work that will define how AI is used inside Epic for the next decade. The scale is real, the problems are hard, and the team is small enough that every engineer makes a decisive architectural impact.
As a Principal Engineer on the AI Platform team, you'll own the technical direction of our agent infrastructure stack end to end. You'll set the architecture across the six platforms above, drive alignment between them, and personally solve the hardest distributed systems and security problems that emerge as the stack scales. You'll work across teams to ensure agent identity, tool governance, memory, and execution infrastructure are coherent, secure, and operable, and you'll mentor the engineers who build alongside you. This isn't a coordinator role. You'll write production code, design protocols, make the calls that determine how agents authenticate and what they're allowed to do, and be accountable for the reliability of systems that are actively used by Epic's engineering organization.
In this role, you will:
- Platform Architecture & Technical Leadership:
  - Own the end-to-end technical architecture across Epic's AI Infrastructure Platforms, ensuring each platform is coherent with the others and that the integration seams are well defined
  - Drive architectural decisions for agent identity and workload authorization (SPIFFE/SPIRE, OIDC token exchange, policy planes), translating security requirements into implementable designs
  - Establish the patterns for how AI agents authenticate, receive credentials, execute tools, and are audited, and hold the bar for correctness across the stack
  - Lead design reviews for new capabilities, evaluate build vs. buy decisions, and surface technical risk before it becomes production risk
- Distributed Systems & Infrastructure:
  - Design and implement the Cluster API and provider abstractions for EMA, the layer that orchestrators depend on to launch, drive, and recover headless agent runs across Kubernetes, EC2, and other compute backends
  - Evolve Epic's AI MCP Gateway plugin runtime (WASM, gRPC sidecar, subprocess multiplexer) and its gateway security posture as external tool surface area grows
  - Architect Epic's knowledge graph, vector search, and memory consolidation pipeline for org-wide scale across teams, scopes, and retention horizons
  - Define durability, consistency, and isolation requirements across event-driven architectures (NATS JetStream, Redis) shared by multiple agent platforms
- Security & Trust:
  - Lead the AI NHI Identity proposal from strategy into staffed execution, defining the separation of AI MCP Gateway (human/tool) and (agent identity) and migrating the existing credential vault
  - Hold the standard for credential security across the stack: AES-256-GCM vault, AAD binding, scope isolation, default-deny policy, and audit completeness
  - Work with Epic's security organization to ensure agent-to-service trust models meet enterprise standards
- Cross-Team Influence & Mentorship:
  - Partner with product, ML, and enterprise platform teams to shape how agent capabilities are exposed to Epic's broader engineering organization
  - Mentor senior and staff engineers across the team; conduct technical interviews and raise the hiring bar
  - Write design documents that become the reference architecture for future work, not just approvals for current work
What we're looking for
- 12 years of software engineering experience, with at least 4 years at staff or principal scope
- Deep expertise in distributed systems: event-driven architectures, durable execution, service mesh, and multi-tenant platform design
- Production experience with authentication and authorization infrastructure: OAuth 2.0, OIDC, SPIFFE/SPIRE or equivalent workload identity, token exchange (RFC 8693), and policy engines (OPA, OpenFGA, or comparable)
- Strong security engineering fundamentals: credential vaulting, secrets management (OpenBao/Vault), audit trail design, and least-privilege access at scale
- Fluency in at least one compiled, systems-capable language (Go preferred, Rust or C acceptable); comfort reading and writing Go microservices is essential given the stack
- Track record of owning multi-service platform architecture across a full product lifecycle, from design through sustained production operation
- Exceptional written communication: design documents and architecture reviews that are clear, precise, and influence without authority
- Hands-on experience building LLM-integrated systems: agent orchestration, tool-use frameworks, MCP (Model Context Protocol), or equivalent agent-to-tool middleware
- Experience with plugin or extension runtime design: WASM sandboxing, gRPC sidecar patterns, subprocess isolation, or comparable capability security models
- Familiarity with knowledge graph systems (Neo4j or comparable), vector databases, and hybrid retrieval (semantic, keyword, graph), as well as experience operating Kubernetes-based platforms: scheduling, workload identity, sidecar injection, and multi-tenancy isolation
EPIC JOB + EPIC BENEFITS = EPIC LIFE
Our intent is to cover all things that are medically necessary and improve the quality of life. We pay 100% of the premiums for both you and your dependents. Our coverage includes Medical, Dental, a Vision HRA, Long Term Disability, Life Insurance & a 401k with competitive match. We also offer a robust mental well-being program through Modern Health, which provides free therapy and coaching for employees & dependents. Throughout the year, we celebrate our employees with events and company-wide paid breaks. We offer unlimited PTO and sick time and recognize individuals for 7 years of employment with a paid sabbatical.
ABOUT US
Epic Games spans across 25 countries with 46 studios and 4,500 employees globally. For over 25 years, we've been making award-winning games and engine technology that empowers others to make visually stunning games and 3D content that bring environments to life like never before. Epic's award-winning Unreal Engine technology not only provides game developers the ability to build high-fidelity, interactive experiences for PC, console, mobile, and VR; it is also a tool being embraced by content creators across a variety of industries, such as media and entertainment, automotive, and architectural design. As we continue to build our Engine technology and develop remarkable games, we strive to build teams of world-class talent.
Like what you hear? Come be a part of something Epic!
Epic Games deeply values diverse teams and an inclusive work culture and we are proud to be an Equal Opportunity employer. Learn more about our Equal Employment Opportunity (EEO) Policy here.
Note to Recruitment Agencies: Epic does not accept any unsolicited resumes or approaches from any unauthorized third party (including recruitment or placement agencies) (i.e. a third party with whom we do not have a negotiated and validly executed agreement). We will not pay any fees to any unauthorized third party. Further details on these matters can be found here.
Required Experience:
Staff IC
About Company
Founded in 1991, Epic Games is a leading interactive entertainment company and provider of 3D engine technology. Epic operates Fortnite, one of the world’s largest games with over 350 million accounts and 2.5 billion friend connections. Epic also develops Unreal Engine, which powers t ...