

Neural Networks Seem Magical? Here’s The Simple, Mathematical Explanation

Last Updated on January 5, 2024 by Editorial Team

Author(s): Max Charney

Originally published on Towards AI.


As the title suggests, machine learning models can seem magical. How in the world does a computer determine whether a picture shows a cat or a dog? To start answering that question, we need to look at the basis of much of modern machine learning: neural networks. These networks rely heavily on mathematics and, in particular, linear algebra.

If you want a glimpse into how neural networks work and want to discover useful resources to learn more, keep reading.


Let’s quickly look at the power of machine learning, a term coined in 1959 by Arthur Samuel. Applications range from facial recognition on your phone and medical diagnosis systems to recommendation systems like Instagram’s.

Neural Networks.

Neural networks underpin much of modern machine learning and are inspired by the human brain, loosely emulating the interconnected communication of biological neurons.

These networks typically consist of three main components: the input layer, the hidden layer(s), and the output layer, which collectively enable the network to transform input information, extract features, and generate an output. Every node, or artificial neuron, links to nodes in the next layer of the network. Each connection carries a weight, and each node beyond the input layer has a bias; these values are initialized before training and adjusted as the network learns.

Interconnected neural network with weights and biases. Source:

A weight represents the strength of the connection between two neurons, determining how much one neuron influences another. Biases help fine-tune artificial neural networks (ANNs) by providing an additional parameter that shifts each neuron’s output; the biases of a layer are collected into a vector.

To compute the output of a node, the formula Z = wx + b is used, where w is the weight vector, b is the bias vector, x is the input matrix, and Z is the output matrix. In other words, each input is multiplied by its weight, the products are summed, and the bias is added. The calculated output serves as an input for the nodes in the next layer, and this process is repeated layer by layer until a final prediction is generated.
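To make the formula concrete, here is a minimal sketch in plain Python. It assumes a single input example, so x is a simple list of numbers rather than a full matrix, and the bias is a single number for the one node. The helper name `node_output` and all of the values are made up for illustration.

```python
def node_output(weights, inputs, bias):
    """Return z = w . x + b for a single node (hypothetical helper)."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    # Multiply each input by its weight and sum the products.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Shift the weighted sum by the bias.
    return weighted_sum + bias


# Made-up values, for illustration only.
weights = [0.5, -1.0, 2.0]
inputs = [4.0, 3.0, 1.0]
bias = 0.5

z = node_output(weights, inputs, bias)
print(z)  # 1.5
```

The weighted sum here is 2.0 - 3.0 + 2.0 = 1.0, and adding the bias of 0.5 gives 1.5.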

A standard neural network architecture. The circles represent nodes, and the solid lines between them represent connections between neurons. The nodes are organized into an input layer, 3 hidden layers, and an output layer that produces the final predictions. Node outputs are calculated using the equation Z = wx + b, and the process is repeated for each layer until a final output or prediction is generated. Source:

Linear Algebra.

Linear algebra is the branch of mathematics concerning linear equations, utilizing vector spaces and matrices.

If you want to learn about linear algebra in depth, there are many good sources online (here are a few: MIT OpenCourseWare, 3Blue1Brown, and even Wikipedia). Here are the bare-bones terms you need to know.

A vector is a one-dimensional array and, for computer science purposes, stores pieces of data. For example, a vector might contain a series of student grades [85, 92, 78] where each element represents the grade of a specific student in a set order.

Similarly, a matrix is a two-dimensional array whose entries are arranged in rows and columns.

Vectors are one-dimensional, while matrices are two-dimensional.
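As a rough illustration, a vector can be stored as a Python list and a matrix as a list of rows. The second row of grades below is invented for this sketch, as are the variable names.

```python
# A vector: one-dimensional, so one index picks out an element.
grades = [85, 92, 78]
second_student = grades[1]

# A matrix: rows and columns, so two indices pick out an element.
# Hypothetical data: each row is one exam, each column is one student.
exam_grades = [
    [85, 92, 78],
    [90, 88, 95],
]
n_rows = len(exam_grades)
n_cols = len(exam_grades[0])
third_student_second_exam = exam_grades[1][2]

print(second_student)             # 92
print((n_rows, n_cols))           # (2, 3)
print(third_student_second_exam)  # 95
```

In practice a numerical library is normally used for this, but nested lists are enough to show the shapes involved.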

Connecting Neural Networks and Linear Algebra.

In the previous formula Z = wx + b, a weight vector (w) is multiplied by an input matrix (x).

Example of vector-matrix multiplication
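The multiplication step can be sketched in plain Python as follows. This assumes the weight vector has one entry per input feature and that each column of the input matrix holds one example; the helper name `vector_matrix_product` and the numbers are made up for illustration.

```python
def vector_matrix_product(vector, matrix):
    """Multiply a length-n row vector by an n x m matrix.

    Returns a list of length m (hypothetical helper).
    """
    if len(vector) != len(matrix):
        raise ValueError("vector length must equal the number of matrix rows")
    n_cols = len(matrix[0])
    # Entry j of the result is the dot product of the vector with column j.
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(vector)))
        for j in range(n_cols)
    ]


w = [1, 2, 3]   # weight vector (made-up values)
x = [           # input matrix: 3 features (rows) x 2 examples (columns)
    [1, 0],
    [0, 1],
    [2, 2],
]

wx = vector_matrix_product(w, x)
print(wx)  # [7, 8]
```

The first entry is 1*1 + 2*0 + 3*2 = 7 and the second is 1*0 + 2*1 + 3*2 = 8, one value per example.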

This product is then added to the bias vector (b), whose values have already been optimized during training.

Example of vector addition
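Vector addition works element by element. The sketch below uses a made-up product vector and a made-up bias vector of the same length; the helper name `add_vectors` is hypothetical.

```python
def add_vectors(a, b):
    """Add two vectors of equal length, element by element (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [a_i + b_i for a_i, b_i in zip(a, b)]


wx = [7, 8]   # a product vector (made-up values)
b = [1, -3]   # a bias vector of the same length (made-up values)

z = add_vectors(wx, b)
print(z)  # [8, 5]
```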

The Bigger Picture.

The Z = wx + b formula is at the heart of neural networks. Other factors play a role, such as optimizing the parameters (the weights and biases) and adding non-linearity through activation functions, but you get the point. By repeating this calculation layer after layer, the neural network arrives at its predictions.
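Here is a minimal sketch of how the pieces fit together in a forward pass. It makes several assumptions that are not in the article: the sigmoid is chosen as the activation function, the network is a tiny 2-input, 2-hidden-node, 1-output example, and all weights and biases are made-up numbers rather than trained values. The function names are hypothetical.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, inputs):
    """Compute sigmoid(w . x + b) for every node in one layer.

    weights holds one weight list per node; biases holds one bias per node.
    """
    outputs = []
    for node_weights, node_bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(node_weights, inputs)) + node_bias
        outputs.append(sigmoid(z))
    return outputs


def forward(layers, inputs):
    """Feed the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        # The outputs of one layer become the inputs of the next.
        activations = layer_forward(weights, biases, activations)
    return activations


# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]

prediction = forward(layers, [1.0, 1.0])
print(prediction)  # a one-element list with a value between 0 and 1
```

Without the activation function, stacking layers would collapse into a single linear formula, which is why the non-linearity matters.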

Based on the final output of the network, predictions can be made (for instance, in a cat/dog classifying model, final values ≥ 0.5 may indicate a cat).
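Turning the final value into a label can be as simple as a comparison. The sketch below follows the cat/dog example above with a 0.5 cutoff; the function name `classify` and the sample scores are made up for illustration.

```python
def classify(score, threshold=0.5):
    """Map a network output to a label (hypothetical helper)."""
    # Scores at or above the threshold count as "cat", the rest as "dog".
    return "cat" if score >= threshold else "dog"


print(classify(0.83))  # cat
print(classify(0.5))   # cat
print(classify(0.12))  # dog
```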

Pretty cool, right?

Final Thoughts

I have delved into the basis of neural networks, but there are many other mathematical concepts relating to machine learning (with optimization and loss functions among them). You can find more information on them below, and in the future, I may write about them more.

Thanks for reading!

Will our world one day be calculated by a series of equations?

I’ve listed some other resources below that may be of interest.


Published via Towards AI
