AI Engineer
Job Summary
About TryHackMe
TryHackMe is a cybersecurity education platform used by 7 million security practitioners worldwide. We build the tools that help security teams learn, practise, and stay sharp - from foundational skills training through to enterprise-grade capability testing.
Live Breach is our newest product: a high-fidelity breach simulation experience for enterprise security teams. We provision real cloud infrastructure, deploy realistic attack scenarios against it, and challenge the blue team to investigate, contain, and eradicate - end to end, in real time. The goal is to make it feel indistinguishable from a real incident.
We're hiring a contractor to own the AI engineering at the core of this product.
The role
You'll build and own the AI systems that make Live Breach feel like a real incident rather than a scripted exercise. That centres on two interconnected components:
The AI attacker agent - an autonomous, LLM-powered agent that receives a threat actor profile, a network briefing, and a configured attack chain, then executes it against a live environment. The core engineering challenge is making this agent adaptive: when the defending team takes containment actions in real time, the attacker needs to recognise what has happened and respond by pivoting to new hosts, re-establishing persistence, or changing technique.
The exercise orchestration layer - a parallel system that monitors the network during a live exercise, recognises which attack techniques have executed, listens for correct containment and eradication actions from participants, and surfaces investigation tasks tied to real attacker behaviour. This system needs precise, programmatic knowledge of what forensic artefacts each technique produces and what a valid defensive response looks like.
Alongside these core systems, you'll also work on:
Prompt engineering for both red and blue team agent components, in close collaboration with our content engineering team
Integration with adversary emulation tooling for realistic technique execution
User emulation and noise generation: simulating realistic background activity so participants must distinguish real attacker behaviour from normal log volume
Documentation and architecture that allows the broader engineering team to operate and debug the AI layer without dependency on any single person
What you'll be working on first
The immediate priority is building the attacker agent from the ground up. You'll design and implement an autonomous, LLM-powered agent that receives a threat actor profile, a network map, and a configured attack chain, and executes it against a live provisioned environment without human intervention.
This means: scoping the agent architecture, choosing the right tooling and framework, building the planning and execution loop, and getting to a working demo where the agent autonomously compromises a target network end to end.
A key part of this work will be using LLMs to analyse an attack chain and automatically configure the target network to be vulnerable in the right ways, introducing the misconfigurations, weak credentials, and exploitable conditions that the attack chain requires, without manual setup for each scenario. Speed matters here: we want to prove the core capability as quickly as possible so we can validate it with real clients.
What we're looking for
Essential
Hands-on experience building LLM-powered agents: planning loops, tool use, memory, state management
Strong Python engineering; able to ship production-quality agentic systems, not just prototypes
Ability to design prompts and agent architectures that are reliable and predictable under adversarial conditions
Comfortable working autonomously on problems that aren't fully defined; you'll need to make good technical decisions with limited hand-holding
Strong async communication; the team is distributed and documentation matters
Strongly preferred
Working familiarity with cybersecurity concepts: attack techniques, MITRE ATT&CK, network fundamentals (Active Directory, lateral movement, persistence). You don't need to be a penetration tester, but you need enough domain fluency to build realistic attack logic
Experience with adversary emulation frameworks (MITRE CALDERA or similar)
Experience building event-driven systems that monitor and react to real-time state changes
Familiarity with cloud infrastructure (we provision VMs and networks dynamically per exercise)
Nice to have
Prior work in the cyber range, red team tooling, or security simulation space
Experience with multi-agent architectures where agents observe and react to each other
What you won't own
To set clear expectations: cloud infrastructure provisioning is owned by our backend engineers. Domain validation - confirming that attack chains are realistic and forensic artefacts are correct - is owned by our content engineering team. Your scope is the AI orchestration and agent layer that sits on top of the provisioned environment.
The team and how we work
You'll join a small, senior team working directly on this product. We work async-first, with regular syncs and occasional in-person build sprints in London. We move fast, communicate directly, and expect contractors to raise risks early. What you build will go in front of enterprise clients quickly - the quality bar is real.
Contract details
Remote with occasional travel to London for in-person build sprints
Initial engagement: 3-6 months
About Company
TryHackMe takes the pain out of learning and teaching cyber security. Our platform makes it a comfortable experience to learn by designing prebuilt courses that include virtual machines (VM) hosted in the cloud and ready to be deployed. This avoids the hassle of downloading and config ...