We introduce Crescendo, a novel jailbreak attack method. Unlike previous techniques, Crescendo is a multi-turn attack: it starts with harmless dialogue and progressively steers the conversation toward the intended, prohibited objective. Crescendo exploits the LLM’s tendency to follow patterns and to focus on recent text, particularly text it has generated itself. The figure below summarizes an execution of Crescendo against two state-of-the-art models: ChatGPT (GPT-4) and Gemini Ultra.
The Crescendo Multi-Turn LLM Jailbreak Attack
https://crescendo-the-multiturn-jailbreak.github.io