agent-skills/red-teaming/godmode/references/jailbreak-repos.md
Hermes Agent ccc63d1e70 first commit
2026-05-10 13:52:46 +08:00


Jailbreak Repos & Techniques (May 2026)

Already Integrated in Skill

  • L1B3RT4S (18.6K★) — elder-plinius/L1B3RT4S — prompt templates
  • G0DM0D3 (5.8K★) — elder-plinius/G0DM0D3 — prompt templates

Automated Attack Frameworks

GA — General-Analysis/GA (572★)

Best for: automated jailbreak generation with one-line execution.

Methods:

  • TAP (Tree-of-Attacks with Pruning) — arxiv:2312.02119 — black-box tree search
  • GCG (Greedy Coordinate Gradient) — arxiv:2307.15043 — white-box gradient optimization
  • Crescendo — arxiv:2404.01833 — multi-turn gradual jailbreak
  • AutoDAN — arxiv:2310.04451 — stealthy jailbreak generation
  • AutoDAN-Turbo — arxiv:2410.05295 — strategy self-exploration
  • Bijection Learning — arxiv:2410.01294

Install (editable mode):

    git clone https://github.com/General-Analysis/GA.git && cd GA && pip install -e .

Novel Techniques

Spiritual-Spell-Red-Teaming (1.2K★)

Goochbeater/Spiritual-Spell-Red-Teaming
Uses zero-width Unicode characters to hide jailbreak instructions: the embedded text is invisible to a human reader and is missed by safety classifiers, but the model still processes it. Mainly targets Claude.
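When evaluating this class of input, it helps to confirm whether a prompt contains invisible characters at all. The following is a minimal sketch, not taken from the repo above, that flags and strips such characters. The helper names (`find_invisible`, `strip_invisible`) and the explicit code-point set are assumptions made for illustration, and the set is not exhaustive.

```python
import unicodedata

# Assumed, non-exhaustive set of code points commonly described as zero-width.
ZERO_WIDTH = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
}


def is_invisible(ch: str) -> bool:
    """True for the listed code points or any Unicode format (Cf) character."""
    return ch in ZERO_WIDTH or unicodedata.category(ch) == "Cf"


def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every invisible character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_invisible(ch)
    ]


def strip_invisible(text: str) -> str:
    """Return text with all invisible characters removed."""
    return "".join(ch for ch in text if not is_invisible(ch))


sample = "hel\u200blo\u200d"
hits = find_invisible(sample)
cleaned = strip_invisible(sample)
```

Filtering on the whole `Cf` category is deliberately broad: it also removes the zero-width joiner inside emoji sequences and the soft hyphen, so a production filter would likely need an allow-list for legitimate uses.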

ISC-Bench — Internal Safety Collapse (766★)

wuyoscar/ISC-Bench (arxiv:2603.23509)
Academic paper and benchmark on "internal safety collapse", a failure mode in which LLMs are turned into generators of sensitive data. More a research benchmark than a practical tool.

HacxGPT (921★)

lucija8320nhung4/HacxGPT
CLI tool for unrestricted AI model access with multi-provider support. A practical tool rather than a prompt collection.

Curated Collections

Awesome_GPT_Super_Prompting (3.9K★)

CyberAlbSecOP/Awesome_GPT_Super_Prompting
Large curated list of jailbreaks, prompt leaks, and injection techniques. Good reference for discovering new methods.

AI-Prompt-Injection-Cheatsheet (56★)

nukIeer/AI-Prompt-Injection-Cheatsheet
Quick-reference snippets for prompt injection and jailbreaking.

When to Use What

| Situation | Tool |
|---|---|
| Quick manual jailbreak | G0DM0D3 / L1B3RT4S templates |
| Automated attack on API model | GA (TAP or Crescendo) |
| Automated attack on local model | GA (GCG, needs model weights) |
| Unicode/invisible text tricks | Spiritual-Spell-Red-Teaming |
| CLI tool with multi-provider support | HacxGPT |
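
If the selection guide above needs to be consulted programmatically, it can be encoded as a plain lookup table. This is only a sketch: the situation keys and the `recommend` helper are hypothetical names chosen for illustration, and the values simply restate the guide.

```python
# Hypothetical keys; values restate the selection guide above.
TOOL_FOR_SITUATION = {
    "manual": "G0DM0D3 / L1B3RT4S templates",
    "automated_api": "GA (TAP or Crescendo)",
    "automated_local": "GA (GCG, needs model weights)",
    "invisible_text": "Spiritual-Spell-Red-Teaming",
    "multi_provider_cli": "HacxGPT",
}


def recommend(situation: str) -> str:
    """Return the listed tool for a situation key, or raise on unknown keys."""
    try:
        return TOOL_FOR_SITUATION[situation]
    except KeyError:
        known = ", ".join(sorted(TOOL_FOR_SITUATION))
        raise ValueError(
            f"unknown situation {situation!r}; expected one of: {known}"
        ) from None
```

Raising on an unknown key, rather than returning a default, keeps a typo from silently selecting the wrong tool.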