Turing, Winograd, or Whither
By Danielle Boccelli
What do we question when we ask if machines can think?
An interesting concept from literary theory states that if a reader wants to make sense of a text, then he will find an interpretation of that text that is consistent with his own worldview, or perhaps more precisely, with his view of the world he supposes the text to concern. Oftentimes, fulfilling this desire requires the reader to fill gaps in his own knowledge, as well as gaps in the writer's logic or rhetoric, by reading between the lines. In this way, all texts are essentially a dialogue initiated by the writer and continued by the reader, with the reader inferring, perhaps erroneously, the intentions of the writer.
Upon learning of this concept, I became enamored of writing poetical nonsense with snippets of text found in books and magazines.ᵃ I was excited by the idea of a reader attempting to interpret meaning from my curated words and phrases and, by doing so, finding his own meaning in the resulting lines; perhaps this excitement is a form of sadism (I don't know), but during my cut-and-paste creative process, each poem began to take on a personal meaning to me, so perhaps not.
But enough about my foibles. Let us now explore the importance of this concept to the field of artificial intelligence, with a particular focus on text generation tasks.
Can machines think?
The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.
– E. W. Dijkstra
While any question that has sparked as much discussion as that of whether a computer can think can surely be classified as interesting based on its very ability to inspire interest, I concede that any question depending primarily on semantic interpretation can be classified as uninteresting, regardless of the extent to which it has been discussed, as any discussion that aims to address such a question is not likely to produce a satisfying and irrefutable yes-or-no answer, and unproductive discussions are perhaps essentially uninteresting.ᵇ
Either way, to ask if a machine can think, one must first define a machine, as well as the verb to think, which can be difficult if there is no specific machine in question, no clear consensus regarding the limits of that machineβs capacity to process information, and no precise definition of what it means to think.
Rather than address all potential sources of semantic uncertainty, I will focus on just one: What does one mean when one claims to think?
As English-speaking humans typically apply the word, to think can be considered synonymous with to believe or to be of the opinion; or, as the action undertaken to draw a conclusion, with to ponder or to contemplate, to muse or to meditate, to reflect or to deliberate, to abstract or to form opinions, to connect or to become lost; etcetera. Unfortunately, these typical applications are no more indicative of the action underlying thought than the original verb to think, and so they cannot be used to determine if a machine can think.
Can humans think?
From a biological perspective, I have heard it said that, during thought, there is something electrochemical going on in the brain as neurons fire (perhaps in a binary way or perhaps in a way more nuanced, but firing nonetheless), and although machines are surely anencephalic by design, I have also heard it said that they process information in a way similar to a biological brain. However, as with the semantic perspective, this similarity seems to come by way of analogy rather than identity, as what a machine does simply does not feel like what a biological brain does, which is think.
If to think is to retain its standard anthropocentric connotation, then clearly computers cannot think, as only humans (and perhaps, if one feels generous, other mammals) can think. However, if to think is simplified to mean to produce an output based on an input (with the start and the end defining the process, and the mechanism of action occurring within a black box), then computers can quite obviously think, although such happenings are more commonly referred to as computing, which simply feels more appropriate.
Our hubris regarding thought is perhaps best supported by everyone's favorite solipsistic aphorism: I think, therefore I am. In other words, my certainty of my very existence (which is the only thing of which I can be certain) is dependent on my ability to think. With thought placed on a pedestal like so, and because I confidently describe what I do as thinking, I have first-hand knowledge of what it is to think, and I assume that others who are like me, i.e., other humans, can also think (although I have no proof), regardless of what it means physically to form thoughts, to conclude, to ruminate, etcetera, within the black box of my skull; and this undertaking, to have a think, so to speak, is not equivalent to computation, because I do not compute: I think (at least I think I do).
Clearly, even an uninteresting question can become interesting when considered from a semantic perspective, but it is also a rather unproductive line of reasoning.
The Imitation Game
The original question, "Can machines think?" I believe to be too meaningless to deserve discussion.¹
– A. M. Turing
Although he claims, like Dijkstra, that the question of whether a machine can think is unworthy of discussion, in 1950, in Mind: A Quarterly Review of Psychology and Philosophy, Alan Turing wrote over 10,000 words exploring the topic, refuting objections to the possibility of thinking machines, and devising a test that can be carried out as an alternative to directly asking that despicable question.
In the essay, Turing devises his famous imitation game, which is more commonly referred to today as the Turing test. During the game, an interrogator must try to determine which of a pair of contestants, i.e., a human and a computer, is the human by asking a series of questions. The questions are not limited in their scope, and so the interrogator could ask directly whether a contestant is human, or pose other questions (e.g., What is your favorite food?) that would require the computer contestant to assume a human persona (i.e., to lie, or the appropriate computer equivalent) to avoid giving itself away.
Each contestant responds to the questions as he/she/it deems fit (whether through thought, computation, osmosis, or what have you), and in the end, if the machine is incorrectly identified by the interrogator as the human contestant, then it is said to pass the Turing test. In other words, as a result of the game, we obtain one person's opinion on whether a machine can produce answers to questions that are more convincingly human than those produced by one real member of the species.ᶜ
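To make the pass condition concrete, here is a minimal sketch of the game's protocol in Python. Everything in it is a placeholder of my own (the canned responders, the coin-flip judge); it stands in for a real person, a real program, and a real interrogator, and shows only the structure of the test: the machine passes when the interrogator misidentifies it as the human.

```python
import random

# A minimal sketch of the imitation game's structure, not of any real system.
# The two responders and the judge below are placeholders of my own devising.
def human(question: str) -> str:
    return "Probably my grandmother's lasagna."

def machine(question: str) -> str:
    return "Pizza, I suppose."  # the machine assumes a human persona

def interrogator(transcripts: dict) -> str:
    # Both transcripts read as plausibly human, so this stand-in judge can
    # only guess which label hides the human; a real judge would reason.
    return random.choice(sorted(transcripts))

def play(questions: list) -> bool:
    """Return True if the machine passes, i.e., is mistaken for the human."""
    responders = [human, machine]
    random.shuffle(responders)                 # hide who is behind A and B
    labels = dict(zip("AB", responders))
    transcripts = {label: [r(q) for q in questions]
                   for label, r in labels.items()}
    guess = interrogator(transcripts)          # "which contestant is human?"
    return labels[guess] is machine            # misidentified: machine passes

print(play(["What is your favorite food?"]))
```

Note what the protocol does not contain: any independent standard of correctness. The outcome rests entirely on one judge's impression of two transcripts, which is the flaw taken up next.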
Whither machine learning?
We should consider, however, that in deception, studied and artful deceit is apt to succeed better and more quickly than science.²
– J. R. Pierce
While the Turing test remains perhaps the most famous test of machine intelligence, it has one major flaw: Because the imitation game relies on the subjective judgment of one interrogator and the performance of one human contestant rather than on an objective and independent standard, it favors a machine's ability to deceive over its ability to reason. Therefore, while able to provide food for thought as a thought experiment in 1950, the test would be quite limited in practice in terms of its ability to evaluate a machine's capacity to reason.
Turing himself was well aware of the above issue; however, to some extent, it seems as if he believed it to be more a feature than a bug:
It is claimed that the interrogator could distinguish the machine from the man simply by setting them a number of problems in arithmetic. The machine would be unmasked because of its deadly accuracy. The reply to this is simple. The machine…would deliberately introduce mistakes in a manner calculated to confuse the interrogator.¹
While the ability of a computer to successfully imitate a human is perhaps interesting to certain people or desirable for certain tasks, I do not believe that the original question of whether a computer can think can be satisfactorily addressed by determining if a computer can well imitate a human, as the question of computer thinking, I believe, is more concerned with the application of logic and the propensity for flexible learning exhibited by humans than with the human-like outcomes achieved on this basis.
In other words, if a computer were able to apply non-human-like but flexibly learned logic to produce non-human-like but internally consistent results, then it would perhaps be more of a thinking entity than a computer that could simply out-human a human.ᵈ ᵉ
The ELIZA Effect
Deception works at least in part because we are extremely forgiving in terms of what we will accept as legitimate conversation.³
– H. J. Levesque
That this shortcoming of the imitation game matters as more than a philosophical exercise is well exemplified by the ELIZA effect. Named after the infamous ELIZA chatbot, which could convincingly imitate a Rogerian psychoanalyst using little more than keyword matching and canned reassembly rules, the ELIZA effect describes the tendency of a person to read too deeply into the responses generated by a computer program. Under the ELIZA effect, a person interacting with a machine may believe that the responses he has received were generated with all the intentionality of a thinking entity.
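To see how little machinery is required to trigger the effect, here is a minimal sketch of ELIZA-style keyword-and-template substitution. The patterns and responses are my own illustrative stand-ins, not Weizenbaum's original script:

```python
import re

# Illustrative ELIZA-style rules: a regex trigger plus a response template.
# These patterns are stand-ins of my own, not Weizenbaum's original script.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun reflection, so echoed fragments read from the listener's side.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "you": "I"}

def reflect(fragment: str) -> str:
    return " ".join(REFLECTIONS.get(word.lower(), word)
                    for word in fragment.split())

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please, go on."  # the default deflection when nothing matches

print(respond("I am unhappy with my job"))
# -> How long have you been unhappy with your job?
```

Nothing in this sketch understands anything; it merely mirrors the speaker's words back with the pronouns flipped. Yet transcripts of exactly this kind of substitution were enough for some of ELIZA's users to attribute understanding, and even care, to the program.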
The ELIZA effect can occur even if the person is aware that he is interacting with a machine, but it is even more insidious if the person believes he is interacting with another human.⁴ Does this mean machines are sadistic? Probably, but this question is entirely uninteresting, and so I will cover it in great detail in a subsequent article.
Moving on.
If not, then what?
What Turing sought to avoid is the philosophical discussion assuming we were able to produce the intelligent behaviour; but how we get there is wide open, including all sorts of internal activity when all is quiet on the external front.³
– H. J. Levesque
When computers were first introduced, the idea of a thinking machine likely seemed far-fetched to some, terrifying to others, and fascinating to those with enough prescience and bravery to be neither skeptical nor scared. Therefore, to steer the conversation away from the doubts and concerns of skeptics and cowards, Turing refuted the major criticisms leveled against thinking machines and devised the imitation game to promote what he believed to be the more valuable discussion: methods that can be used to evaluate machines in terms of ability.
However, it is well understood today that, for a machine to be truly intelligent, it must do more than simply produce deceptively human-like responses to inputs. Therefore, as a method for better qualifying machine intelligence, the Winograd Schema (WS) challenge was proposed.
In the WS challenge, a question such as the following could be asked: "Mark could not see the stage from behind Paul because he was too short. Who is too short, Mark or Paul?" For such questions of pronoun resolution, a non-trivial understanding of language is required because it is not possible to know a priori who is too short; the pronoun can only be resolved through spatial reasoning about who is standing behind whom. Therefore, while this question can be easily answered by any native English-speaking adult, it is difficult for a computer to answer.
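One detail worth making explicit: each schema comes in a pair, with a "special word" (here, short versus tall) whose swap flips the correct answer while leaving the surface statistics of the sentence almost unchanged. Here is a small sketch of that structure, with a data layout and a naive baseline of my own invention rather than anything taken from Levesque's paper:

```python
from dataclasses import dataclass

@dataclass
class WinogradSchema:
    sentence: str    # sentence containing the ambiguous pronoun
    candidates: tuple  # the two possible antecedents
    answer: str      # the correct antecedent for this variant

# The two variants differ only in the special word (short vs. tall),
# which flips the correct answer; word statistics alone are of little help.
SCHEMAS = [
    WinogradSchema(
        "Mark could not see the stage from behind Paul because he was too short.",
        ("Mark", "Paul"), "Mark"),
    WinogradSchema(
        "Mark could not see the stage from behind Paul because he was too tall.",
        ("Mark", "Paul"), "Paul"),
]

def always_first(schema: WinogradSchema) -> str:
    """A deliberately naive resolver: always pick the first candidate."""
    return schema.candidates[0]

correct = sum(always_first(s) == s.answer for s in SCHEMAS)
print(f"naive baseline: {correct}/{len(SCHEMAS)} correct")  # chance level
```

Because the paired variants are anti-correlated by construction, any resolver that leans on surface frequencies scores near chance across the pair, which is exactly the property that makes the challenge harder to game than the imitation game.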
Like Turing's imitation game, the WS challenge requires a machine to have an understanding of language; however, unlike the game, the challenge is based on binary-choice reading comprehension questions rather than on a conversational approach. Therefore, the WS challenge is an improvement upon Turing's imitation game in two main ways: (1) it is not likely to be fooled by a deceptive machine, and (2) it can be graded objectively, without expert judges.
Does a computer that can pass the Winograd Schema challenge think? Who knows? Maybe all matter is conscious; maybe life is a simulation.
Final thoughts
When we question machine thinking, learning, and intelligence, we have a tendency to compare computers to humans because we have an impossibly intimate relationship with thought. Perhaps, then, thinking et al. are the wrong terms to use to describe what a machine does, because to ask if a machine can think, etcetera, the machine must be anthropomorphized to an often inappropriate and uncomfortable extent.
So perhaps we should not inquire of machines using words from our anthropocentric vocabularies, but rather ask if they are accurate enough to accomplish the tasks with which we, as a species, no longer want to be bothered. The question of whether a computer can think is uninteresting because the semantics used to describe the decision-making mechanism, and the similarity of that mechanism to the human brain's, do not matter as long as the computer can ethically and accurately accomplish the tasks for which it was programmed.ᶠ
However, I believe that, without great advances in the processes by which computers learn, imitating a human in a way that is more sophisticated than parroting and more honest than deception is likely impossible for some tasks, such as friendly communication tasks that require the endearing balance of defense mechanism and malapropism that plagues the average human creature, and entirely irrelevant for others, such as numerical calculation, as the ability of a computer at such tasks already far surpasses that of any human, and it would be plain silly to dull its abilities to match ours.
So perhaps computers simply compute, and only humans think, or perhaps the brain is just a sloppy yet flexibly brilliant computer; perhaps, perhaps, but ~2,600 words later, the topic is still of little interest.
Notes
b. The question of whether a question is interesting is also a question of semantics, and so it is also quite possibly uninteresting, in and of itself, depending, of course, on oneβs definition of interesting; this logic proceeds ad infinitum without becoming any more or less interesting.
c. I do not know what a computer passing the Turing test is to say about the human contestant, but I assume it must say something, that the claim inextricably depends on the sophistication of the computer, and that it is not quite complimentary, regardless of the level of sophistication of the non-human contestant.
d. The same logic can be applied to humans: a person who can memorize the solution to a problem in mathematics does not necessarily understand how to solve the problem, just as a person who behaves as an upstanding citizen is not necessarily pure of thought; however, in some cases, the outcome (i.e., a passing grade or an upstanding citizen) is what one cares about.
e. Further along these lines, to take a rationalist perspective, perhaps the whole endeavor of using human-produced examples to engender artificial intelligence is fundamentally flawed, as such examples can be used to produce only human imitations, not thinking machines.
f. If we were discussing machine consciousness, then I would perhaps have to revise these statements, but that is an entirely different level of philosophizing that is way beyond the scope of this article.
Bibliography
1. Turing, A. M. 1950. Computing Machinery and Intelligence. Mind LIX(236): 433–460.
2. Pierce, J. R. 1969. Whither Speech Recognition? The Journal of the Acoustical Society of America 46(4B): 1049–1051.
3. Levesque, H. J. 2011. The Winograd Schema Challenge. AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.
4. Wikipedia. The ELIZA Effect.