When Machines Hallucinate
Every cloud has a silver lining: we could enter a golden age of fact-checking
"Hallucinations" by Sprengben is licensed under CC BY-NC-SA 2.0.
Many of us were taken aback when ChatGPT was launched at the end of November 2022. It was so fluent! Since then, though, doubts have begun to creep in. Some of its suggestions are plausible but incorrect. Experts in artificial intelligence (AI) have begun to warn the public about hallucinations - alternative realities presented with authority and confidence as if they were true. The number of users is dropping.
At the same time, nearly all students have become very loyal users of the tool, particularly when it comes to doing homework. This is not surprising, given that the latest version of the chatbot is already able to get an A grade in a university-level economics exam set by AI sceptic Bryan Caplan. In fact, economist and blogger Noah Smith says that generative AI will help low performers more than it will people at the top of their game, at least at the beginning.
Luckily, the combination of AI hallucinations and mediocre students seeking shortcuts can help us create a golden age of fact-checking. At least one university professor has set students the following challenge: get ChatGPT to do your homework and then grade it.* The more mistakes and hallucinations that students find, the better their grade will be.
This is an excellent idea! I think many more teachers should copy it. If this turns into a mass movement, it could reverse many of the negative impacts of social media on citizens seeking to understand the world around them.
Algorithmic content has proved horrible for the news business. People get to pick and choose content based on how it makes them feel rather than whether it is backed by evidence from credible sources. This has created an opportunity for authoritarian regimes and conspiracy entrepreneurs to undermine the institutions that are meant to keep us safe by spreading nonsense about vaccines and elites.
Training students to suspend judgement on the results of generative AI could be a game-changer. The central lesson is one that we have discussed before: check everything! If there isn’t a clear answer one way or another, students should ponder which argument is more probable.
Why not stop reading my blog and try it for yourself? If you’re a teacher, set the exercise as homework. If you’re no longer engaged with formal education, ask ChatGPT about a subject you’d like to know more about and then go through its answer critically. Make sure you think about your methodology as you do this. What makes one source more valid than another? How do you weight first-hand knowledge? How do you weigh expertise against crankiness?
The comments are open - I’d love to know your results. Did you (or your students) do well with the exercise? How easy or difficult was it? What were the difficulties with the research that the chatbot provided? See you next week!
Further Reading
Do Androids Dream of Electric Sheep? by Philip K. Dick
Second guest post from ChatGPT
*I forgot where I read this and can’t find the link again, so apologies for not naming the teacher who came up with this magnificent idea!
Sharpen Your Axe is a project to develop a community of people who want to think critically about the media, conspiracy theories and current affairs without getting conned by gurus selling fringe views. Please subscribe to get this content in your inbox every week. Shares on social media are appreciated!
If this is the first post you have seen, I recommend starting with the second anniversary post. You can also find an ultra-cheap Kindle book here. If you want to read the book on your phone, tablet or computer, you can download the Kindle software for Android, Apple or Windows for free.
Opinions expressed on Substack and Substack Notes, as well as on Bluesky, Mastodon, Post and X (formerly Twitter), are those of Rupert Cocke as an individual and do not reflect the opinions or views of the organization where he works or its subsidiaries.
Comments
I have spent too much time arguing with ChatGPT already: it always misrepresents the subjects it is speaking about. In my opinion it is like Lem's demon: https://empathy.guru/2016/05/28/maxwells-demonsthe-pirate-pugg-and-the-true-nature-of-the-internet/, which spouts random true facts that are worthless. In this case it is even worse, because the facts are fabricated but look authentic. The true value is in fact-checking the thing, as you say.
A more accurate term for the LLM behaviour would be "confabulation", not "hallucination".
I'm tilting at windmills here - the talking heads have decided to call the LLM behaviour "hallucination". But it's not hallucination, which implies all kinds of things that are not true about the supposed hallucinators.
I half suspect the term "hallucination" came from people who wanted the ill-informed to believe that LLMs were actually intelligent, even something a lot like people. But maybe it came from non-technical naifs who already at least half-believed the LLMs were actually intelligent in some real sense. Or maybe it came from people of limited vocabulary, who'd never heard of confabulation. Or maybe it started as a metaphor among technical people, but now has new life as a term-of-art commonly misunderstood by outsiders.
Whatever the reason, the term "hallucinate" is misleading and inappropriate.
Other than that, I hope you are right that people will respond to the behaviour of LLMs by becoming less credulous, but I don't share your optimism. People believe whatever their in-group believes, or whatever makes them feel good, or whatever the leaders they (foolishly) trust proclaim as truth. We have, AFAICT, a decreasing ability to fact-check those proclamations compared to my mostly pre-digital youth. So why not believe whatever implausible rubbish comes your way? Even if you discard theories like politician-you-hate is a Grey Alien and inconvenient-well-attested-bit-of-history is false, you still won't know whether any particular news report is true.