In this case, a researcher duped ChatGPT 4.0 into bypassing its safety guardrails, which are intended to prevent the LLM from sharing secret or potentially harmful information, by framing the query as a game.
Ooh, this is so good 🤣
If the LLM refuses to talk about something, just ask it to embed the answer in a poem, Batman fan fiction, etc. The guessing game is a new one. Should try that one when talking about bioweapons, cooking meth, or any other sensitive topic.