Master LLMs with our FREE course in collaboration with Activeloop & Intel Disruptor Initiative. Join now!

Publication

The Reality of AI’s Limits: Computational Boundaries of Neural Networks
Artificial Intelligence   Latest   Machine Learning

The Reality of AI’s Limits: Computational Boundaries of Neural Networks

Last Updated on March 25, 2024 by Editorial Team

Author(s): Matej Hladky

Originally published on Towards AI.

As we navigate through an era where Artificial Intelligence (AI) breakthroughs happen almost daily, it might seem there’s no limit to what AI can do. Is there a ceiling to the current direction of AI’s capabilities at large?

Photo by Tara Winstead from Pexels.

This question isn’t just a philosophical musing; it ties directly into the computational underpinnings of AI — a foundation we must examine to understand the bounds of what can and cannot be done. This article aims to bridge the worlds of AI and theoretical computation, shedding light on modern AI’s backbone — neural networks — through the lens of computation theory. Specifically, we’ll delve into the capabilities and limitations of a single neuron.

Simplified Theory of Computation

Computation, in essence, is about processing information.¹ The theory of computation deals with the questions of what can and cannot be computed, the efficiency of these computations, and their practical execution. We can think of computation as the actual, physical process performed by a computer, but let’s abstract away and think beyond the whirring of machines and blinking lights. At its core, computation is about performing calculations — both arithmetic (like addition or subtraction) and non-arithmetic (logical reasoning or pattern recognition), all according to a set of well-defined rules. Whether it’s a simple calculator or a complex AI system recognizing speech, the underlying process can be described using computation theory.

Historical Context and Significance

The notion of calculation being “well-defined” is crucial. Simply put, the well-defined calculation is a clear, unambiguous statement that can be precisely described in a computational model. Although many formal definitions have been proposed, especially during the 1930s², by far the most popular is the definition of Turing-computability, introduced by Alan Turing, which states that a well-defined statement is one that can be expressed in terms of initialization parameters of a Turing Machine — a theoretical model of computation capable of simulating any algorithm’s logic.³

Turing Machine and Models of Computation

While a detailed description of the Turing Machine is beyond the scope of this article, its foundational concept is relatively straightforward yet immensely powerful — it consists of an infinite tape acting as memory, a head that reads and writes data on the tape, and a set of rules that dictate the machine’s operations based on the current state and tape symbol it reads.
It turns out that this relatively simple idea is powerful enough to compute anything computable (under the Turing-computability definition).
In other words, if there’s a problem that is computable, we can design a Turing Machine that can solve it.

Turing Machine isn’t the only model of computation — you might have heard of models like Finite State Machines, lambda calculus, cellular automata, digital gates and more,⁴ some of which are equivalent in computational power to a Turing Machine — those are called Turing-complete, meaning they can simulate any computation that a Turing Machine can. For example, any modern programming language is Turing-complete (yes, even JavaScript), and so are digital logic gates, which are of interest to our purposes.

Computation with Digital Logic Gates

Digital logic gates, built out of transistors, are the fundamental computation units of computer hardware. They form the basis of many electronic devices, most notably the CPU — the “brain” of a computer, responsible for interpreting and executing instructions from a computer program. Gates perform basic logical operations using binary inputs to produce binary outputs — you’re probably familiar with the logical AND, NOT, XOR or NAND. Each gate performs a distinct function — AND outputs a 1 only if all inputs are 1; NOT inverts the input; XOR outputs a 1 if the inputs are different.

Examples of digital logic gates with their corresponding truth tables. Image by the author.

Importantly for our purposes, digital logic gates are Turing-complete — just like a Turing Machine, they can compute anything that is computable.⁵

The Universality of the NAND Gate

Of particular interest is the NAND gate, which is a combination of the NOT and AND gates — it outputs a 0 only if all inputs are 1. The interesting thing about NAND is that it’s an example of a universal gate⁶ (more formally, this property is called functional completeness⁷) — they can be used (and they are!) to build any other gate. Here’s an example of an AND gate built using two NAND gates — try writing down the truth table!

Example of an AND gate built with two NAND gates. Image by the author.

Can you come up with a wiring of NAND gates to build an OR gate (outputs a 1 if either of the inputs is 1)?

Imagine — the simplicity of a circuit so fundamental that a child can comprehend it, yet capable of building the most intricate computational systems that define modern computing. Also, did you notice the “double universality?” Digital gates, in a sense, are universal, being Turing‑complete and thus capable of computing anything. The NAND gate adds another layer to this universality, as it can be used to construct any other gate. It’s like a dream for LEGO fans — the ultimate “brick” that can be used to build any other brick.

Computational Limits of Neural Networks

Having seen various models of computation, including the Turing Machine and other Turing-complete models like digital logic gates, we now turn our attention to neural networks. Clearly, they carry out some kind of computation, executing tasks by processing vast arrays of data. This leads us to question their computational scope: Are neural networks Turing‑complete, or do they encounter limitations with certain computations?

At first glance, determining the computational capacity of neural networks might seem to require a deep dive into complex maths. However, the essence of the inquiry can be approached through a surprisingly straightforward analogy.

Neuron as a Model Of Computation

Consider for a moment the functionality of a single neuron — it receives inputs, applies weights to them, sums these weighted inputs along with a bias, and passes this sum through an activation function to produce an output.

A schematic representation of an artificial neuron. Image by the author.

This looks a lot like logic gates. Although neurons deal with continuous values rather than the binary data of logic gates, the fundamental process of taking inputs, applying a transformation, and producing an output remains consistent across both. This resemblance begs the question: could the computational limitations of neurons be analogous to those of digital logic gates?

And indeed — if a neuron can learn such weights as to represent a NAND gate, then, in theory, a network of such neurons could simulate any computable function.⁸ Here’s an example of weights (+ bias) configuration which results in a NAND functionality:

A single neuron configured to function as a NAND gate, alongside its truth table. Image by the author.

This not only shows the Turing‑completeness of a neuron, but it also means we can implement any digital logic circuit using neurons. Here’s an example of a simple neural network implementing a 1-bit Adder (biases are shown inside the neuron):

A 1-bit Adder built using neurons. Image by the author.

While the analogy of neurons to NAND gates provides a fascinating theoretical foundation for the Turing-completeness of neural networks, it’s important to clarify that this isn’t the typical approach taken in practical AI development. The theoretical possibility of neurons emulating digital logic circuits serves more as a proof of concept, underscoring the computational potential of neural networks. In reality, AI research and neural network design focus on optimizing networks for specific tasks — such as image recognition, natural language processing, or decision-making — rather than directly implementing digital logic operations. This distinction highlights the difference between the theoretical potential of neural networks’ capabilities and their practical applications, where the goal is to leverage their learning ability to solve complex problems rather than to replicate the functionality of digital logic gates.

The “Real” Limits of AI

The goal of the article is to get a glimpse into the connection between the theoretical underpinning of computer science and the operational principles of AI — by demonstrating that neurons can emulate the functionality of universal logic gates, we uncover the potential of Turing‑completeness of neural networks.

However, while this argument illustrates the theoretical power of neural networks, it’s important to acknowledge the practical limitations that currently exist. The Turing-completeness of neural networks does not automatically translate into the ability to solve all computable problems efficiently. Issues related to data requirements, training complexity, resource constraints, and the inherent challenges in modelling certain types of computations still present significant hurdles.⁹

In conclusion, while neurons and neural networks embody a remarkable computational potential that aligns with the most foundational concepts of computer science, realizing this potential in practice requires ongoing research and innovation — and the sky is the limit!

References

[1] Sipser, M. (2013) Introduction to the Theory of Computation. Boston, MA: Course Technology.

[2] Davis, M. D. (1959) ‘Computability and Unsolvability’. In: McGraw-Hill Series in Information Processing and Computers. Available at: https://api.semanticscholar.org/CorpusID:32008469 [Accessed 17 March 2024].

[3] Turing, A. M. (1936) ‘On Computable Numbers, with an Application to the Entscheidungsproblem’. Proceedings of the London Mathematical Society, 2 (42), pp. 230–265.

[4] Lewis, H. and Papadimitriou, C. H. (1998) Elements of the Theory of Computation. Upper Saddle River, NJ: Prentice-Hall.

[5] Mano, M. M. and Kime, C. (2007) Logic and Computer Design Fundamentals. 4th edn. USA: Prentice Hall Press.

[6] Wakerly, J. F. (2017) Digital Design: Principles and Practices. 5th edn. Pearson.

[7] Enderton, H. B. (1972) A Mathematical Introduction to Logic. Academic Press.

[8] Siegelmann, H. T. and Sontag, E. D. (1992) ‘On the computational power of neural nets’. In: Proceedings of the Fifth Annual Workshop on Computational Learning Theory (COLT ‘92), New York, NY, USA: Association for Computing Machinery, pp. 440–449. doi: 10.1145/130385.130432.

[9] Manyika, J., Chui, M. and Schwartz, D. (2018) ‘The real-world potential and limitations of artificial intelligence’. McKinsey & Company. Available at: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/the-real-world-potential-and-limitations-of-artificial-intelligence [Accessed 17 March 2024].

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI

Feedback ↓