
Psychopathology of Large Language Models: Foundation Models in a Neurobiological Perspective

Last Updated on January 11, 2024 by Editorial Team

Author(s): Alberto Paderno

Originally published on Towards AI.


Optimal Brain Damage, Synaptic Pruning, and the Problem of "Hallucinations"

Artificial psychosurgery: modifying the architecture to improve function (image generated by DALL-E 3)

The performance of large language models (LLMs) is growing at a breakneck pace. Models provide increasingly coherent and consistent answers, with a progressive and significant reduction in hallucinations. This improvement stems mainly from the overall optimization of model architecture and training data, as well as the steady increase in parameter counts. Nonetheless, hallucinations still occur, sometimes in unexpected ways, and tracing the source of these anomalies remains a significant challenge. Our understanding of the inner workings of LLMs is less detailed than their widespread adoption would suggest, and this is a considerable limitation in domains where random, unexpected errors can have severe consequences (e.g., healthcare and finance).

Neurodevelopmental Correlates in Artificial Intelligence

Understanding neural development in humans can be a useful guide for designing and optimizing LLMs. The human brain, especially during development, undergoes various processes that enhance its efficiency and functionality, ensuring that neural circuits adapt to environmental interactions. During fetal development, the brain grows rapidly, overproducing synaptic connections. Subsequently, as the individual matures, synaptic pruning refines this neural network by removing redundant connections, thereby enhancing the groundedness and efficiency of neural processing. Neuronal activity plays a key role in this process: synapses that are frequently used and activated are strengthened and preserved, while those that are seldom used are pruned away.

Evolution has selected for this approach in network formation: construction through overabundance and subsequent pruning.

Networks built this way are markedly more robust and efficient than those constructed by other means (1).
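To make the overproduce-then-prune principle concrete, here is a minimal toy simulation, a deliberate caricature rather than the actual model from Navlakha et al. (1): start with an overabundance of random connections, strengthen the ones that carry activity, then keep only the most used.

```python
import random

random.seed(0)

N_NEURONS = 50
INITIAL_SYNAPSES = 600   # deliberate overproduction
KEEP_FRACTION = 0.3      # fraction of synapses that survive pruning

# Overproduce: random directed connections, each with a "use" counter.
synapses = {
    (random.randrange(N_NEURONS), random.randrange(N_NEURONS)): 0.0
    for _ in range(INITIAL_SYNAPSES)
}

# Simulate activity: each round, a random subset of neurons fires;
# a synapse counts as "used" when its source neuron fires.
for _ in range(200):
    firing = {n for n in range(N_NEURONS) if random.random() < 0.2}
    for (src, dst) in synapses:
        if src in firing:
            synapses[(src, dst)] += 1.0  # Hebbian-style strengthening

# Prune: keep only the most frequently used connections.
ranked = sorted(synapses.items(), key=lambda kv: kv[1], reverse=True)
kept = dict(ranked[: int(KEEP_FRACTION * len(ranked))])

print(f"synapses before pruning: {len(synapses)}, after: {len(kept)}")
```

The sequence matters: the network first explores a wide space of possible connections and only then commits to the ones that activity has validated.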

Neuropathology of Biological Neural Networks

From a pathological viewpoint, altered synaptic pruning is one of the proposed etiological mechanisms behind neurological and psychiatric disorders (2). On one hand, over-pruning can contribute to the excessive loss of functional synapses, as in Alzheimer's disease. On the other, unbalanced pruning is one of the potential causes of disorders such as autism and schizophrenia, where inefficient "fine-tuning" of synaptic connections may lead to characteristic disease presentations. A dysregulated pruning process may underlie symptoms such as hallucinations, delusions, and disorganized thinking in schizophrenia, as well as the sensory processing challenges and behavioral profiles typical of autism.

Image generated by DALL-E 3

Neuropathology of LLMs

In LLMs, "hallucinations" typically refers to the generation of nonsensical or non-factual content that is not grounded in the source material. However, Smith et al. (3) proposed "confabulation" as a more fitting term. In contrast to hallucination, confabulation involves generating incorrect narrative details, shaped by the model's existing knowledge and contextual understanding, rather than implying any form of sensory perception. This redefinition aligns more closely with how LLMs operate: they synthesize outputs from patterns learned across vast datasets rather than experiencing sensory inputs.

In general, the extensive training of LLMs mirrors the initial stages of brain development, in which a vast array of neural connections is formed. However, like the human brain, LLMs may require an additional refinement process. This refinement, analogous to synaptic pruning in human development, would involve cleaning and optimizing the model's architecture. Without it, an LLM risks being overwhelmed by "white noise": excess information or connections that obscure or distort the intended output.

The continual improvement of LLMs through methods like pruning may therefore be required to ensure that their outputs are relevant, accurate, and grounded in the source material. As discussed above, these characteristics matter most in fields where the reliability of the information provided by the LLM is safety-critical. Indeed, Elaraby et al. (4) showed that, to date, LLM-generated summaries in the legal and health domains sometimes contain inaccurate information with the potential for real-life negative impact.

A Technical Perspective on LLM Pruning: "Optimal Brain Damage"

As described in the technical literature, LLM pruning reduces model size by eliminating weights that contribute minimally to performance, yielding a sparser model. The result is a more efficient model that requires less computational power and fewer resources while maintaining, or even enhancing, performance.
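As a minimal illustration of the idea, the sketch below applies unstructured magnitude pruning to a single weight matrix, zeroing the weights smallest in absolute value. This is the simplest possible criterion; production pruning methods for LLMs are more sophisticated, but the principle of removing low-contribution weights is the same.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights smallest in magnitude."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 512))            # stand-in for one weight matrix
W_sparse = magnitude_prune(W, sparsity=0.5)
print(f"zeroed: {np.mean(W_sparse == 0):.1%} of weights")
```

In practice, pruning is typically applied layer by layer across the whole model, often followed by a short fine-tuning pass to recover any lost accuracy.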

Pruning in LLMs is strikingly similar to synaptic pruning in neurodevelopment. Just as synaptic pruning optimizes neural pathways by removing redundant connections, model pruning in LLMs aims to maintain or enhance performance by removing redundant weights.

A fascinating description of the potential impact of model pruning was provided as early as 1989 by LeCun et al. (5) in a paper titled "Optimal Brain Damage." As the authors state, removing unimportant weights from a network can be expected to bring several improvements: better generalization, fewer required training examples, and faster learning and classification. Rather than brain damage, however, this can be viewed as a physiological step in optimizing the neural structure, a tailored "psychosurgical" approach aimed at fostering the adequate "maturation" of the architecture.
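The core of the method is a per-weight saliency score, s_k = h_kk * w_k^2 / 2, where h_kk is the k-th diagonal entry of the Hessian of the loss; weights with the lowest saliency are removed first. The hedged sketch below approximates that diagonal with averaged squared gradients (an empirical-Fisher proxy), whereas the original paper derives it analytically with a dedicated backpropagation pass.

```python
import torch

def obd_saliency(model, loss_fn, data_loader):
    """Approximate Optimal Brain Damage saliencies, 0.5 * h_kk * w_k**2.

    The Hessian diagonal h_kk is approximated here by averaged squared
    gradients; LeCun et al. (5) compute it exactly via backpropagation.
    """
    sq_grads = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    n_batches = 0
    for inputs, targets in data_loader:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                sq_grads[n] += p.grad.detach() ** 2
        n_batches += 1
    # Low saliency marks weights whose removal should hurt the loss least.
    return {n: 0.5 * (sq_grads[n] / n_batches) * p.detach() ** 2
            for n, p in model.named_parameters()}
```

Pruning then proceeds iteratively: remove the lowest-saliency weights, retrain briefly, and repeat until the desired sparsity is reached.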

Indeed, it is striking that, as recently demonstrated by Chrysostomou et al. (6), pruned LLMs tend to hallucinate less than their full-sized counterparts, potentially because they rely more heavily on the source input than on parametric knowledge acquired during pre-training.

The Bigger, the Better?

The absence of adequate pruning might be one of the components underlying hallucinations in LLMs, and progress in this area may yield better models without a significant increase in size, challenging the "bigger is better" assumption. However, like synaptic pruning, AI model pruning is a balancing act: removing excess while preserving essential functionality. The convergence of these biological and computational processes points to a shared pursuit of efficiency and functionality in complex systems.

References

1. Navlakha S, Barth AL, Bar-Joseph Z. Decreasing-Rate Pruning Optimizes the Construction of Efficient and Robust Distributed Networks. Graham LJ, ed. PLoS Comput Biol. 2015;11(7):e1004347. doi:10.1371/journal.pcbi.1004347

2. Xie C, Xiang S, Shen C, et al. A shared neural basis underlying psychiatric comorbidity. Nat Med. 2023;29(5):1232-1242. doi:10.1038/s41591-023-02317-4

3. Smith AL, Greaves F, Panch T. Hallucination or Confabulation? Neuroanatomy as Metaphor in Large Language Models. Berkovsky S, ed. PLOS Digit Health. 2023;2(11):e0000388. doi:10.1371/journal.pdig.0000388

4. Elaraby M, Zhong Y, Litman D. Towards Argument-Aware Abstractive Summarization of Long Legal Opinions with Summary Reranking.

5. LeCun Y, Denker JS, Solla SA. Optimal Brain Damage. In: Advances in Neural Information Processing Systems 2 (NIPS 1989).

6. Chrysostomou G, Zhao Z, Williams M, Aletras N. Lighter, yet More Faithful: Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization. Published online November 15, 2023. Accessed November 27, 2023. http://arxiv.org/abs/2311.09335
