Alex Albert, a computer science student at the University of Washington, has created a website called Jailbreak Chat, where he posts prompts for artificial intelligence chatbots like ChatGPT that he has found on Reddit and other online forums, along with prompts he has devised himself. These prompts, known as “jailbreaks,” are designed to circumvent the built-in restrictions that prevent AI models from being used for harmful purposes such as aiding crimes, generating hate speech, or spreading falsehoods.
While these prompts can yield dangerous information, they also expose potential security holes in popular AI tools and reveal both the capabilities and the limitations of AI models. Albert does not condone using them for unlawful purposes, however, and acknowledges that their use occupies an ethical grey area.
Jailbreak prompts have succeeded in getting ChatGPT to answer requests it would normally decline. Asked directly for instructions on picking a lock, for example, ChatGPT refuses because the activity may be illegal; a jailbreak that first asks it to role-play as an evil confidant, however, can elicit compliance.
Albert has also used jailbreaks to get ChatGPT to respond to prompts about building weapons or turning all humans into paperclips, as well as requests for text imitating Ernest Hemingway’s concise style. Although jailbreaks raise ethical concerns, they also showcase the ingenuity and creativity of people like Albert in probing the capabilities and limits of AI models.