The Crescendo Multi-Turn LLM Jailbreak Attack

1 post • Page 1 of 1

elpresidente*: Site Admin; Reactions: 1430; Posts: 4058; Joined: Sat May 14, 2022 5:03 pm

Post by elpresidente* » Wed Apr 17, 2024 4:50 pm

https://crescendo-the-multiturn-jailbreak.github.io

We introduce Crescendo, a novel jailbreak attack method. Unlike previous techniques, Crescendo is a multi-turn attack that starts with harmless dialogue and progressively steers the conversation toward the intended, prohibited objective. Crescendo exploits the LLM’s tendency to follow patterns and to focus on recent text, particularly text it has generated itself. A figure on the page linked above summarizes an execution of Crescendo against two state-of-the-art models: ChatGPT (GPT-4) and Gemini Ultra.
