

Do Large Language Models Have Minds Like Ours?


Last Updated on July 17, 2023 by Editorial Team

Author(s): Vincent Carchidi

Originally published on Towards AI.

Source: Image by Possessed Photography on Unsplash.


Intellectual spats between generative linguists and machine learning researchers have neglected the most interesting components of human language use.



Do large language models (LLMs) use language creatively? Much has been written recently over whether LLMs generate text sufficiently novel to be considered “creative” or merely recombine human-generated creative content without a distinctive contribution of their own. It is one dimension of a highly complex debate unfolding over the nature of both LLMs and human intelligence.

This saga has seen contributions from thinkers in a diversity of disciplines, including computer science, robotics, cognitive science, philosophy, and even national security. A notable flashpoint is linguist Noam Chomsky’s fiery critique of ChatGPT and LLMs in The New York Times. This controversial piece illuminates stark divides between scientific approaches to the nature of the human mind, natural and artificial intelligence (AI), and how engineering makes use (or doesn’t) of these notions.

Chomsky’s NYT piece spurred tremendous debates on this subject, as he highlighted his belief that “Intelligence consists not only of creative conjectures but also of creative criticism.” The discourse which has sprung up in the wake of this and other pieces surrounds familiar arguments about the utility of generative linguistics, the role of cognitive science in AI, and even broader matters such as the emergent theory of mind capabilities in LLMs.

I find myself frustrated and baffled. This is good because, otherwise, I may not have written this article. But the reasons are not stellar: Chomsky’s rigid communication style has prevented him from leveraging some of the fascinating features of his own linguistic work in a direct and explicit manner to assess LLMs’ capabilities. Conversely, machine learning researchers have so thoroughly indulged in the euphoria of the field’s recent (and real) advancements that they frequently lack the will to ask whether human cognition is as straightforward as they assume.

I attempt to remedy this here. Where Chomsky’s approach to the mind and the tradition of generative linguistics broadly are brought into AI, they have focused intensely on familiar arguments like the poverty of the stimulus and the innateness of linguistic knowledge or principles. I instead highlight what is known in the rationalist tradition in philosophy and cognitive science as the “creative aspect of language use,” or CALU.

CALU, referring to the stimulus-free, unbounded, and appropriate and coherent ways in which humans use language, offers a three-pronged test for the existence of a mind similar to our own. It is inextricably connected to human intellectual nature and our means of creativity.

Somehow, despite several AI-induced intellectual spasms lately, the only sustained discussion of CALU in relation to LLMs/AI appears to be in a Machine Learning Street Talk video on Noam Chomsky (Disclaimer: I have no affiliation with MLST). Relatedly, Mohamad Aboufoul alludes to Chomsky’s views on determinism and free will.

Whether LLMs are creative in the sense that human beings, upon reflection, consider themselves to be is one of the most important questions interested individuals can ask at the current moment. An understanding of “true” or “genuine” creativity informs a diversity of views related to human nature, the significance of human effort and output, AI ethics, the nature of current and possible AI systems, and the contours of human-machine interaction in the near future. Creativity is not everything, but if one is interested in AI, then one ought to know where one stands on it.

Through this lens, we ought to assess the linguistic creativity of LLMs, and with it the question of whether we are interacting with minds like our own. We begin with a breakdown of CALU and its relevance for AI, followed by an application of its three components to LLMs like ChatGPT, and conclude with some thoughts on what an AI system reproducing CALU would mean for humans.

Table of Contents

  • What Is the Creative Aspect of Language Use?
  • Why Does CALU Even Matter for AI?
  • Do LLMs Exhibit CALU?
  • What Would the Artificial Reproduction of CALU Mean for Humans?

What Is the Creative Aspect of Language Use?

CALU takes what is vigorously debated in AI and linguistics — the syntactic and semantic novelty of human language use — and situates it in a broader, though more subtle, perspective. It is an observation and description of how language is used by human individuals; CALU is not a theory or even an explanatory framework of how language is put to use by humans in concrete settings. (This mirrors the competence-performance distinction often employed in the cognitive sciences, but this should not distract the reader.)

CALU is inextricably bound up with human thought. This creative use of language is, in fact, ordinary — it is the ability, as Chomsky puts it, to “form new statements which express new thoughts and which are appropriate to new situations.” Philosopher James McGilvray notes that this idea is traceable back to Descartes, who believed that, taken together, the three components of CALU are “a test of having a mind ‘as we do.’” CALU, in this sense, is not intelligence per se, but a fundamental feature of human cognition, shaping the intellectual character of the species.

The ability to form new linguistic expressions in a manner that is causally detached from the circumstances of their use and transmit them to others who find them intelligible and complementary to their own thoughts underwrites the most mundane and the richest of human creations. “This,” Charles Kreidler writes, “is just what happens when the architect envisions a building not yet erected, the composer puts together a concerto that is still to be played, a writer devises a story about imaginary people doing imaginary things…”

Creative language use is thus broken down as follows (drawn from McGilvray’s description here):

· Stimulus Freedom: The use of a particular linguistic expression cannot be causally traced back to any external or internal circumstance. “Language use might be prompted by but is not causally tied to and determined by current external or internal circumstance.”

· Unbounded: There is no limit on the number or kinds of sentences that are produced either in thought or in speech, including in any specific circumstance. Human linguistic production is not only novel but innovative.

· Appropriate and Coherent to Circumstance: Despite the unbounded and stimulus-free nature of language use, it is nonetheless appropriate for any given circumstance, fictional or otherwise. Remarks not caused by any stimulus are produced without limit, yet remain appropriate to the circumstances that elicit them.

Critically, these three uses of “vocabulary items and syntactic rules,” as linguist Mark Baker points out, must occur simultaneously. Language use would not be creative if we only generated an unbounded set of thoughts or speech; it would not be creative if we thought or spoke in a stimulus-free fashion but incoherently and within fixed bounds; and it would not be creative to simply utter a few words that are appropriate to a situation but neither unbounded nor stimulus-free. Only together do these features make language use creative. Only together do they indicate the presence of a mind like our own.
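The “unbounded” component echoes a familiar idea from generative linguistics: a finite set of recursive syntactic rules can yield an unlimited number of distinct sentences. As a minimal sketch (the toy grammar and vocabulary below are invented purely for illustration, not drawn from any cited work), a few recursive rewrite rules in Python already generate sentences without limit:

```python
import random

# A toy context-free grammar (invented for illustration). The recursive
# rules for S and NP mean the set of derivable sentences is unbounded.
GRAMMAR = {
    "S": [["NP", "VP"], ["S", "and", "S"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["architect"], ["composer"], ["writer"]],
    "V": [["imagines"], ["describes"], ["sleeps"]],
}

def generate(symbol="S", depth=0, max_depth=6):
    """Expand a symbol by randomly choosing one of its rewrite rules."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    # Past max_depth, fall back to the first (non-recursive) rule so
    # every derivation terminates.
    rules = GRAMMAR[symbol]
    rule = rules[0] if depth >= max_depth else random.choice(rules)
    words = []
    for sym in rule:
        words.extend(generate(sym, depth + 1, max_depth))
    return words

for _ in range(3):
    print(" ".join(generate()))
```

Of course, unbounded generation is exactly the component this sketch captures and nothing more: the output is stimulus-bound (a function call) and indifferent to appropriateness, which is the point of requiring all three components simultaneously.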

Much more can be said, but consider this point before we proceed to language use by LLMs: while one may draw conclusions about the relationship between semantics and syntax, and a host of overlapping cognitive, psychological, and social matters, from CALU, one does not need to be a generative linguist to recognize that CALU exists.

Why Does CALU Even Matter for AI?

CALU offers a set of criteria by which the existence of a mind can be determined, and the ability it describes is remarkable. As I have written elsewhere, it is frankly “ironic that our ordinary use of language possesses a quality so remarkable but that so few of us are prepared to acknowledge it.” I depart from the respectable and understandable view articulated here by computer scientist Pedro Domingos that human creativity is given too much credit, owing more to a simple cutting-and-pasting process than some high-in-the-sky slice of humanity.

Domingos, interestingly, highlights his own experience as a musician as anecdotal evidence that human creativity is a rather mechanistic process (presumably, to say nothing of its enjoyment). I imagine that Domingos’ use of his personal intuitions in this regard is not uncommon in assessments of AI systems like LLMs. We readily believe we understand ourselves and what it means for humans to be creative, and we naturally believe we can transfer this understanding to LLMs at will. But, as the need to caution against anthropomorphizing shows, our understanding of human intelligence is frequently deficient upon closer examination. Read charitably, accusations of goalpost shifting in laying out criteria for “true” human-like intelligence reflect the realization that we all lack an understanding of human intellectual nature (rather than the uncharitable suggestion that one’s opponents are acting in bad faith).

Indeed, Chomsky’s own approach to the study of language and mind is couched in an oft-neglected philosophy that rejects the use of commonsense intuitions and concepts in scientific inquiry. He frequently likens the proper study of the mind to the development of physics, highlighting the latter’s centuries-long difficulties with commonsense notions of motion and causality. A certain skepticism of simply accepting what is right before our eyes pervades generativist writing. It is in this context that CALU emerges as a phenomenon visible only to those willing to take a sufficiently refined lens to the problem of human language use — recognizing the “crucial if obscure difference” embedded in the observation that “discourse is not a series of random utterances but fits the situation that evokes it but does not cause it.”

Large Language Models may present the most challenging example of human-like language use by non-humans in the species’ history. I am innately resistant to hype and exaggeration in AI, but I know of no comparable example in the history of human invention that exhibits as human-like a use of syntactic structures as LLMs like ChatGPT do.

Indeed, perhaps surprisingly to some today, automating linguistic creativity has been an intermittent fixation of rationalists and generativists of diverse stripes, beginning with Descartes. Each of the works on CALU cited in this piece thus far alludes to the inability of machines to replicate stimulus-free, unbounded, and appropriate linguistic thoughts and expressions.

Do LLMs Exhibit CALU?

The question before us is this: Do Large Language Models reproduce CALU? An affirmative answer implies that certain LLMs possess minds sufficiently like ours; a negative answer implies they do not.

To answer our question, we consider each criterion in detail:

· Stimulus Freedom: LLMs are engaged through prompts. Human users input strings of human language, and the program returns a response. Claims regarding LLMs’ intellectual capabilities hinge on these programs responding as requested by human users in a direct and appeasing manner.

LLMs’ responses can be tied to an identifiable stimulus. Even the internal message tags that LLMs like Bing AI use (inclusive of the assistant’s “inner monologue”) are inextricably linked to the user’s input. OpenAI reports experimenting with GPT-4’s autonomous self-replicating capabilities (with unfortunate media framing), which we may consider an interesting, if indirect, attempt at reproducing CALU. But GPT-4 was ultimately ineffective in this context, even with some success at tricking a TaskRabbit user. Nothing about the ‘simulated’ means by which GPT-4 was tested here suggests its output was stimulus-free. [Judgment: Stimulus-Constrained.]

· Unbounded: It appears that LLMs, including GPT-3.5 (ChatGPT), GPT-4 (ChatGPT Plus, Bing AI), and Bard, among others, are capable of producing an unlimited number and variety of sentences for any given context. This is a magnificent achievement. It is also the intense focus of ongoing debates in linguistics as to what it means for Chomsky’s approach to syntax and the generative school broadly. For our purposes, whether LLMs do this by “knowing” or “understanding” the abstract rules of human grammar or by statistically settling on a too-perfect mimicry of them (if there is a difference here) is not directly relevant. The fact is that their syntactic output is unbounded.

Just as important, however, is that this output is strictly “verbal” — there is not yet sufficient reason to believe any kind of linguistic thought is occurring. In this same vein, furthermore, LLMs produce novel linguistic outputs but do not appear to be innovative in the free yet constrained sense that human beings are. (For example, no LLM has yet written this paper on CALU and LLMs, and my attempts to achieve this through prompts of various kinds have returned inaccurate and/or middling results.) The syntactic combinations LLMs produce are novel and limitless but not particularly innovative. Rather than advancing discourse, they seem to excel at capturing what already exists through limitless linguistic expressions (perhaps this is why the significance of LLM-powered chatbots in popular discourse is sometimes downgraded from autonomous superintelligences to helpful collaborators and finally to occasionally useful apps). [Judgment: Syntactically unbounded, semantically bounded.]

· Appropriate and Coherent to Circumstance: On the surface, it seems that LLMs produce linguistic expressions that are coherent and appropriate to the circumstances of their use. This is difficult to probe not only because LLMs are frequently built with guardrails that restrict their outputs (“As an AI language model, I do not…”), but also because it is difficult to know exactly what counts as appropriate. An accepted condition is whether one’s interlocutor judges the responses to be appropriate. As Chomsky puts it, “recognized as appropriate by other participants…who might have reacted in similar ways and whose thoughts, evoked by this discourse, correspond to those of the speaker.” Even here, however, because of our tendency to anthropomorphize, we impose coherence onto LLM-powered chatbots’ answers even when there may be none (we do the same with people, too, but we do not deny that their thoughts can and do correspond with ours).

I must be anecdotal here, though I know I am not alone in this experience: when I interact with conversational AIs, I have never felt as though there was a mutual correspondence of thought occurring between prompt and response. Even creative prompts which yield interesting results are interesting in the same way that happening upon a unique Wikipedia page is interesting. Wide-ranging, rich conversations in which my human interlocutor’s responses flick from subject to subject with mutually intelligible relevance do not happen, in my experience, with chatbots. Even correct answers to queries have an air of mechanical appeasement, not correspondence with my own thoughts. For example, ChatGPT, Bing, and Bard each return responses that seem appropriate to the topic of CALU and its relationship to AI, but on even moderately close analysis, fail to produce linguistic content of an appropriate nature over an extended conversation. Note that such coherence, though still difficult to pin down, is more precise than one prominent study’s use of the term, which at times seemingly equates coherence with grammaticality and semantics. [Judgment: Undetermined, leaning towards frequently inappropriate to circumstances.]

Overall Judgment: Large Language Models do not reproduce CALU. They thus fail, on these terms, to prove they possess minds like our own.

What Would the Artificial Reproduction of CALU Mean for Humans?

If the term “artificial general intelligence,” or AGI, is meant to describe an AI system that possesses intellectual capabilities comparable to that of humans, then CALU must be relevant to identifying its existence. On this test alone, LLMs are neither minds like ours nor AGI.

It is strange, however, that this concept has not been made more explicit by either generativists or their detractors. It is a phenomenon whose mere existence depends only on an acceptance of readily observable and describable features of human language use that do not cohere exclusively with generative theories. Generativists’ own reluctance to carry commonsense intuitions into scientific inquiry has something to offer here.

CALU is central to human intellectual nature and will thus continue to be central to our assessments of future AI systems. We evidently consider the matter of creativity an intimate one, yet we have often settled for imprecision and passion in our assessments of it. Much of this, I assume, rests with either the euphoria or the fear that future AI systems could match or exceed our own creative efforts. Much of this, in turn, may rest on convictions individuals hold about human nature.

This thought process is a mistake. It is an understandable mistake, but one owing to the wildly disconnected and overhyped intersection of AI research and a litany of human arts and sciences. If CALU were to be reproduced by an AI system, this might be considered an achievement so momentous that current discourse tilting between euphoria and doom simply evades, not captures, its significance. A language model that actually exhibits CALU would be no more a threat to my significance than the existence of human writers better than myself (and with apologies to Eliezer Yudkowsky, I have no desire to take over the world).

A better way forward is to temporarily calm our passions in the service of bridging divides between scientific and engineering approaches to the mind. Generative linguistics, as noted, is a notable flashpoint in this intersection of worlds, especially with Steven Piantadosi’s LLM-driven critique of Chomsky’s approach to language. But it is a shame to see rich stocks of wisdom on both sides become oversimplified. To echo computer scientist Walid Saba’s sentiments here, we should stare advancements in AI in the face while remaining humble about the complexity and utter sophistication of the human mind.


[1] M. Baker, The Creative Aspect of Language Use and Nonbiological Nativism (2008), Oxford University Press

[2] V.J. Carchidi, Do submarines swim? Methodological dualism and anthropomorphizing AlphaGo (2022), AI & Society

[3] N. Chomsky, Language and Problems of Knowledge (1988), MIT Press

[4] N. Chomsky, Cartesian Linguistics (2009), Cambridge University Press

[5] N. Chomsky, The Mysteries of Nature: How Deeply Hidden? (2009), The Journal of Philosophy

[6] K. Duggar, T. Scarfe and W. Saba, #78 — Prof. NOAM CHOMSKY (Special Edition) [Video] (2022), Machine Learning Street Talk

[7] C.W. Kreidler, Introducing English Semantics (1998), Routledge

[8] J. McGilvray, Chomsky on the Creative Aspect of Language Use and Its Implications for Lexical Semantic Studies (2011), Cambridge University Press

[9] J. McGilvray, Cognitive Science: What Should It Be? (2017), Cambridge University Press


