#(it's hard to make instruction tuning data that teaches the model to sound unsure iff it's really unsure)
Explore tagged Tumblr posts
nostalgebraist · 7 months ago
Text
On the topic of Declare:
A little while ago, on a whim, I prompted several different chat LLMs with a line from the Arabian Nights that has special importance in the novel:
O fish, are you constant to the old covenant?
In the book, this is used as a sign between members of a secret group, and there's a canned response that members are meant to give to identify themselves. (Like "I like your shoelaces" / "Thanks, I stole them from the President.").
I thought it'd be funny and impressive if one of these models responded with the canned phrase from Declare: that would have demonstrated both a command of somewhat obscure information and a humanlike ability to flexibly respond to an unusual input in the same spirit as that input.
None of the models did so, although I was still impressed with Gemini's reaction: it correctly sourced the quote to Arabian Nights in its first message, and was able to guess and/or remember that the quote was also used in Declare in follow-up chat (after a few wrong guesses and hints from me).
On the other hand, GPT-4 confidently stated that the quotation was from Jubilate Agno, a real but unrelated book:
Tumblr media
When I asked Claude 3, it asserted that the line was from a real-but unrelated-poem called "The Fish," then proceeded -- without my asking -- to quote (i.e make up) a stanza from its imagined version of that poem:
Tumblr media
It is always discomfiting to be reminded that -- no matter how much "safety" tuning these things are put through, and despite how preachy they can be about their own supposed aversion to "misinformation" or whatever -- they are nonetheless happy to confidently bullshit to the user like this.
I'm sure they have an internal sense of how sure or unsure they are of any given claim, but it seems they have (effectively) been trained not to let it influence their answers, and instead project maximum certainty almost all of the time.
135 notes · View notes