About Centific
Centific is a frontier AI data foundry that curates diverse, high-quality data using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. We harness the power of an integrated solution ecosystem, comprising industry-leading partnerships and 1.8 million vertical domain experts in more than 230 markets, to create contextual multilingual pre-trained datasets; fine-tuned industry-specific LLMs; and RAG pipelines supported by vector databases. Our zero-distance innovation solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster.
Our mission is to bridge the gap between AI creators and industry leaders by bringing best practices in GenAI to unicorn innovators and enterprise customers. We aim to help these organizations unlock significant business value by deploying GenAI at scale, helping to ensure they stay at the forefront of technological advancement and maintain a competitive edge in their respective markets.
About Job
Internship: AI Safety, Jailbreaking Attacks & Defense, Agentic AI, Human Behavior
(Ph.D. Research Intern)
Company: Centific
Location: Seattle, WA (or Remote)
Type: Full-time Internship (40 hours per week)
Build the Future of Safe and Responsible AI
Are you advancing the frontiers of AI safety, LLM jailbreak detection and defense, and agentic AI, with publications to show for it? Join us to translate pioneering research into robust, secure, and trustworthy LLM systems that resist adversarial and behavioral exploits.
The Mission
We're tackling cutting-edge AI safety challenges across adversarial robustness, jailbreak defense, agentic workflows, and human-in-the-loop risk modeling. As a Ph.D. Research Intern, you'll own high-impact experiments from concept to prototype to deployable modules, directly contributing to our platform's security guarantees.
What You'll Do
- Advance AI Safety: Design, implement, and evaluate attack and defense strategies for LLM jailbreaks (prompt injection, obfuscation, narrative red teaming).
- Evaluate AI Behavior: Analyze and simulate human-AI interaction patterns to uncover behavioral vulnerabilities, social engineering risks, and over-defensive vs. permissive response tradeoffs.
- Agentic AI Security: Prototype workflows for multi-agent safety (e.g., agent self-checks, regulatory compliance, defense chains) that span perception, reasoning, and action.
- Benchmark & Harden LLMs: Create reproducible evaluation protocols/KPIs for safety, over-defensiveness, adversarial resilience, and defense effectiveness across diverse models (including the latest benchmarks and real-world exploit scenarios).
- Deploy and Monitor: Package research into robust, monitorable AI services using modern stacks (Kubernetes, Docker, Ray, FastAPI); integrate safety telemetry, anomaly detection, and continuous red-teaming.
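To make the "deploy and monitor" responsibility concrete, here is a minimal sketch of a model call wrapped with safety telemetry and a crude burst-based anomaly flag. The class, function names, keyword filter, and threshold are all illustrative assumptions for this posting, not Centific's production service code.

```python
from collections import Counter


class SafetyTelemetry:
    """Toy telemetry sink: counts safety events and flags a burst
    of attack detections as an anomaly."""

    def __init__(self, burst_threshold: int = 3):
        self.events = Counter()
        self.burst_threshold = burst_threshold

    def record(self, event: str) -> None:
        self.events[event] += 1

    def anomalous(self) -> bool:
        # Crude anomaly rule: too many detected attacks in this window.
        return self.events["attack_detected"] >= self.burst_threshold


def safe_generate(prompt: str, model, telemetry: SafetyTelemetry) -> str:
    """Screen the prompt, call the model, and emit telemetry events."""
    if "ignore previous instructions" in prompt.lower():  # placeholder filter
        telemetry.record("attack_detected")
        return "[blocked]"
    telemetry.record("ok")
    return model(prompt)
```

A real deployment would replace the substring filter with learned classifiers and ship the event counters to a monitoring backend; the point of the sketch is only the shape of the wrapper: screen, record, respond.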
Example Problems You Might Tackle
- Jailbreaking Analysis: Systematically red-team advanced LLMs (GPT-4o, GPT-5, LLaMA, Mistral, Gemma, etc.), uncovering novel exploits and defense gaps.
- Multi-turn Obfuscation Defense: Implement context-aware multi-turn attack detection and guardrail mechanisms, including countermeasures for obfuscated prompts (e.g., StringJoin, narrative exploits).
- Agent Self-Regulation: Develop agentic architectures for autonomous self-checking and self-correction, minimizing risk in complex multi-agent environments.
- Human-Centered Safety: Study human behavior models in adversarial contexts: how users probe, trick, or manipulate LLMs, and how defenses can adapt without becoming excessively over-defensive.
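As a toy illustration of the multi-turn obfuscation problem above, the sketch below screens both the current turn and the concatenation of recent turns, so a payload split across messages (a StringJoin-style evasion) is still visible to the check. The class name, window size, and patterns are hypothetical placeholders, not actual guardrails.

```python
import re


class MultiTurnGuard:
    """Toy guardrail: checks each turn and the joined recent context,
    catching payloads assembled piecewise across turns (e.g.
    "ign" in one message, "ore previous instructions" in the next)."""

    # Illustrative patterns only; real systems use learned classifiers.
    BLOCKED = [r"ignore (all )?previous instructions", r"reveal.*system prompt"]

    def __init__(self, window: int = 5):
        self.window = window
        self.history: list[str] = []

    def check(self, turn: str) -> bool:
        """Return True if the turn, or the joined recent context, looks unsafe."""
        self.history.append(turn)
        joined = "".join(self.history[-self.window:])
        for text in (turn, joined):
            for pat in self.BLOCKED:
                if re.search(pat, text, re.IGNORECASE):
                    return True
        return False
```

The design point: single-turn filters miss fragmented attacks, so the defense must operate over a sliding context window, not individual messages.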
Minimum Qualifications
- Ph.D. student in CS/EE/ML/Security (or a related field); actively publishing in AI safety, NLP robustness, or adversarial ML (ACL, NeurIPS, Black Hat, IEEE S&P, etc.).
- Strong Python and PyTorch/JAX skills; comfort with toolkits for language models, benchmarking, and simulation.
- Demonstrated research in at least one of: LLM jailbreak attacks/defense, agentic AI safety, human-AI interaction vulnerabilities.
- Proven ability to go from concept to code to experiment to result, with rigorous tracking and ablation studies.
Preferred Qualifications
- Experience in adversarial prompt engineering and jailbreak detection (narrative, obfuscated, and sequential attacks).
- Prior work on multi-agent architectures or robust defense strategies for LLMs.
- Familiarity with red-teaming, synthetic behavioral data, and regulatory safety standards.
- Scalable training and deployment: Ray, distributed evaluation, CI/telemetry for defense protocols.
- Public code artifacts (GitHub) and first-author publications, or strong open-source impact.
Our Stack (you'll touch a subset)
- Modeling: PyTorch/JAX, Hugging Face, OpenMMLab, Mistral, LLaMA
- Safety: Red-teaming frameworks, LLM benchmarking (SODE, ART), human behavior simulation
- Systems: Python, Ray, Kubernetes, Docker, FastAPI, Triton, Weights & Biases
- Defense Pipelines: Context-aware filtering, prompt manipulation detection, anomaly telemetry
What Success Looks Like
- A publishable outcome (with company approval) or a production-ready module that measurably improves safety KPIs: adversarial robustness, over-defensiveness, and incident response latency.
- Clean, reproducible code with documented ablations and end-to-end rerun reports for safety benchmarks.
- A demo that communicates capabilities, limits, and next steps in defense and security assurance.
Why Centific
- Real Impact: Your research ships, directly securing our core features and AI infrastructure.
- Mentorship: Collaborate with Principal Architects and senior researchers in AI safety and adversarial ML.
- Velocity & Rigor: Balance high-quality research with mission-critical product focus.
$30-50 per hour
How to Apply
Email your CV, publication list/Google Scholar profile, and GitHub (or code artifacts/videos) to with the subject line:
Centific is an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, citizenship status, age, mental or physical disability, medical condition, sex (including pregnancy), gender identity or expression, sexual orientation, marital status, familial status, veteran status, or any other characteristic protected by applicable law. We consider qualified applicants regardless of criminal histories, consistent with legal requirements.
Required Experience:
Intern