Turing Test, Chinese Room, and Large Language Models
Author(s): Moshe Sipper, Ph.D.
Originally published on Towards AI.
The Turing Test is a classic idea within the field of AI. Alan Turing proposed the test, which he originally called "the imitation game", in 1950 in his paper "Computing Machinery and Intelligence". The goal of the test is to ascertain whether a machine exhibits intelligent behavior on par with (and perhaps indistinguishable from) that of a human.
The test goes like this: An interrogator (player C) sits alone in a room with a computer, which is connected to two other rooms, one for each of the other players. Player A is a computer, and player B is a human. The interrogator's task is to determine which player, A or B, is the computer and which is the human. The interrogator is limited to typing questions on their computer and receiving written responses.
The test doesn't delve into the workings of the players' hardware or brains; it looks only for intelligent behavior. Supposedly, an intelligent-enough computer will be able to pass itself off as a human.
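To make the setup concrete, here is a toy sketch of the imitation game in Python. Everything in it is invented for illustration: the players, their canned one-line replies, and the labels X and Y; Turing's actual test, of course, involves sustained, free-form conversation rather than a single question.

```python
import random

# Player A: a machine. In a real test this could be any program, an LLM, say.
def machine_player(question: str) -> str:
    return "Alan Turing was a British mathematician and codebreaker."

# Player B: a human, faked here with a canned reply to keep the sketch self-contained.
def human_player(question: str) -> str:
    return "He's the mathematician this very test is named after."

def imitation_game(question: str) -> bool:
    # The interrogator (player C) sees only text labelled X and Y and must
    # guess which label hides the machine.
    labels = {"X": machine_player, "Y": human_player}
    if random.random() < 0.5:  # shuffle who sits behind which label
        labels = {"X": human_player, "Y": machine_player}
    for label, player in labels.items():
        print(f"{label}: {player(question)}")
    guess = input("Which player is the machine, X or Y? ").strip().upper()
    return labels.get(guess) is machine_player  # True if the machine was unmasked

if __name__ == "__main__":
    if imitation_game("Who was Alan Turing?"):
        print("The interrogator spotted the machine.")
    else:
        print("The machine passed this round.")
```

The only thing that matters in the sketch is the blindness of the setup: the interrogator judges from text alone, with no peek at who, or what, produced it.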
The Turing Test has sparked much debate and controversy in the intervening years, and with current Large Language Models (LLMs) such as ChatGPT, it might behoove us to place this test front and center.
Do LLMs pass the Turing Test?
Before tackling this question, I'd like to point out that we are creatures of Nature (something we forget at times), who got here by evolution through natural selection. This entails a whole bag of quirks that are due to our evolutionary history.
One such quirk is our quickness to assign agency to inanimate objects. Have you ever kicked your car and shouted at it, "Will you start already?!" And consider how many users of ChatGPT begin their prompt with "Please". Why? It's a program, after all, and it could not care less whether you prompted, "Please tell me who Alan Turing is" or "Tell me who Alan Turing is".
But that's us. We wander the world ascribing all kinds of properties to the various objects we encounter. Why? Basically, this probably conferred a survival benefit, helping us cope with nature.
In 1980, philosopher John Searle came up with an ingenious argument against the viability of the Turing Test as a gauge of intelligence. The Chinese room argument, laid out in his paper "Minds, Brains, and Programs", holds that a computer running a program can't really have a mind or an understanding, no matter how intelligent or human-like its behavior.
Here's how the argument goes: Suppose someone creates an AI, running on a computer, which behaves as if it understands Chinese (an LLM, maybe?).
The program takes Chinese characters as input, follows the computer code, and produces Chinese characters as output. And the computer does so in such a convincing manner that it passes the Turing Test with flying colors: people are convinced the computer is a live Chinese speaker. It's got an answer for everything, in Chinese.
Searle asked: Does the machine really understand Chinese, or is it merely simulating the ability to understand Chinese?
Hmm…
Now suppose I step into the room and replace the computer.
I assure you I do not speak Chinese (alas). But I am given a book, which is basically the English version of the computer program (yeah, it's a large book). I'm also given lots of scratch paper and lots of pencils. There's a slot in the door through which people can send me their questions, on sheets of paper, written in Chinese.
I process those Chinese characters according to the book of instructions I've got (it'll take a while), but ultimately, through a display of sheer patience, I provide an answer in Chinese, written on a piece of paper. I then send the reply out the slot.
The people outside the room are thinking, "Hey, the guy in there speaks Chinese." Again: I most definitely do not.
Searle argued that there's really no difference between me and the computer. We're both just following a step-by-step manual, producing behavior that is interpreted as an intelligent conversation in Chinese. But neither I nor the computer really speaks Chinese, let alone understands Chinese.
And without understanding, argued Searle, there's no thinking going on. His ingenious argument gave rise to a heated debate: "Well, the whole system (me, the book, the pencils) understands Chinese"; "Disagree, the system is just a guy and a bunch of objects"; "But…"; and so on, and so on.
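To see why the argument bites, here is a toy "rule book" in Python, in the spirit of Searle's thought experiment; the handful of Chinese phrases and the fallback reply are invented for illustration, and a small dictionary stands in for Searle's enormous book of instructions.

```python
# Blind symbol manipulation: match the incoming characters against the rules,
# then copy out the prescribed reply. Nothing in this code understands Chinese.
RULE_BOOK = {
    "你好": "你好！",                  # if you see these symbols, send these back
    "你会说中文吗": "当然会。",
    "今天天气怎么样": "今天天气很好。",
}

def chinese_room(symbols: str) -> str:
    # Follow the book mechanically: look the symbols up, copy out the answer.
    # The fallback reply simply asks the questioner to repeat themselves.
    return RULE_BOOK.get(symbols, "请再说一遍。")

print(chinese_room("你会说中文吗"))  # prints 当然会。 -- and yet nothing here "speaks" Chinese
```

Whether I execute these steps with pencil and paper or a computer executes them in silicon, the lookup is the whole trick: symbols in, symbols out, and meaning nowhere in sight.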
Today's LLMs, such as ChatGPT, are extremely good at holding a conversation. Do they pass the Turing Test? That's a matter of opinion, and I suspect said opinions run the gamut from "heck, no" to "Duh, of course". My own limited experience with LLMs suggests that they're close, but no cigar. At some point in the conversation, I usually realize it's an AI, not a human.
But even if LLMs have passed the Turing Test, I still can't help but think of Searle's room.
I doubt that what we're seeing right now is an actual mind.
As for the future? I'd go with management consultant Peter Drucker, who quipped: "Trying to predict the future is like trying to drive down a country road at night with no lights while looking out the back window".
(And if they do have an actual mind one day, it won't be like ours…)
I See Dead People, or It's Intelligence, Jim, But Not As We Know It
Take a look at this picture, the well-known painting "American Gothic" by Grant Wood: