
Understanding Neural Networks — and Building One!
Last Updated on September 25, 2025 by Editorial Team
Author(s): Aditya Gupta
Originally published on Towards AI.
Why Do We Need Neural Networks?
Imagine trying to teach a computer to do something humans find easy, like recognizing a face in a photo, understanding someone’s accent, or predicting which movie you’ll enjoy next. Traditional programming struggles here because these tasks don’t follow strict rules. You can’t write an “if-else” for every possible scenario; there are just too many variations.
This is where neural networks come in. They’re designed to learn from examples instead of being told exactly what to do. Feed them enough data, and they can discover patterns that are too complex for humans to explicitly code.
Some things neural networks are especially good at:
- Image recognition: Spotting faces, cats, or handwritten digits.
- Speech and language: Understanding voice commands, translating languages, or generating text.
- Predictions: Forecasting stock prices, weather, or even your next favorite song.
- Medical diagnostics: Detecting diseases from scans or medical data.
In short: neural networks are needed for any problem where patterns are complicated, messy, or too numerous to define with rules. They’re the “learning brains” for computers, letting machines tackle tasks that were once thought impossible.
What Are Neural Networks?
Think about a normal family dinner at home. The food is served, and everyone at the table has their own opinion.
- Your mom might notice whether the vegetables are cooked properly.
- Your dad cares whether the food feels filling.
- Your sibling only judges based on how good the dessert is.
- Your grandmother might focus on whether the food is light and healthy.
Each person looks at the same dinner but from a different angle. And when you combine all these opinions, you arrive at the final judgment: “Dinner was great.”
That’s how a neural network works.
- Each “family member” is like a neuron, focusing on one part of the input.
- Some opinions count more than others, just like weights in a network.
- Put together, they form the final output.
In short: a neural network is nothing more than lots of small judgments coming together to make one big decision.
How Neural Networks Work
Think of a neural network as a big family dinner decision-making process. The goal is to decide if the meal is great or not, but instead of one person deciding, several family members give their opinions. Each opinion counts differently depending on who it is. This is very similar to how neural networks process information.

1. The Neuron
At the core of a neural network is a neuron. A neuron takes some inputs, applies weights to them, adds a bias, and then passes the result through an activation function to produce an output.
- In the family dinner analogy, a neuron is like one family member. They focus on a specific aspect of the dinner, for example, whether the dal has enough spice.
- The weight is how important their opinion is. Maybe Maa’s opinion matters a lot, and your sibling’s opinion matters less.
- The bias is a baseline tendency. Maybe Dadi always likes the food a bit more, so her input is slightly adjusted up.
Math Explanation
A single neuron can be written as:

$$z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b$$

where $x_1, \dots, x_n$ are the inputs, $w_1, \dots, w_n$ are the weights, and $b$ is the bias. Then the neuron applies an activation function $f$ to get the output:

$$y = f(z)$$
- For example, a sigmoid activation squashes the output between 0 and 1, like a yes/no opinion.
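To make this concrete, here is a minimal sketch of a single neuron in NumPy. All the numbers below are invented for illustration:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))  # squashes any number into (0, 1)

x = np.array([0.8, 0.2, 0.5])  # hypothetical inputs (spice, aroma, portion)
w = np.array([0.9, 0.3, 0.5])  # weights: how much each input matters
b = 0.1                        # bias: a baseline tendency

z = np.dot(w, x) + b           # weighted sum plus bias
y = sigmoid(z)                 # activation turns it into a yes/no-style score
print(y)                       # a value between 0 and 1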

2. Layers
Neurons are arranged in layers.
- Input layer: The raw information about the dinner. For example, saltiness, aroma, portion size, sweetness. Each input goes to one neuron.
- Hidden layer(s): Neurons that combine the inputs and process them further. These are like family members discussing among themselves before giving a final opinion.
- Output layer: The final decision. For example, “Dinner was great” or “Dinner was okay.”
Math Explanation for a Layer
Suppose you have a layer of $m$ neurons and $n$ inputs. Each neuron $j$ has its own weights and bias and computes:

$$z_j = \sum_{i=1}^{n} w_{ji} x_i + b_j, \qquad a_j = f(z_j)$$

Stacking the weights into an $m \times n$ matrix $W$ lets you write the whole layer at once: $\mathbf{a} = f(W\mathbf{x} + \mathbf{b})$.
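In code, a whole layer is just a matrix multiplication followed by an element-wise activation. A minimal NumPy sketch, with sizes and random values chosen purely for illustration:

import numpy as np

n, m = 4, 3                    # n inputs, m neurons in the layer
rng = np.random.default_rng(0)

x = rng.random(n)              # input vector, shape (n,)
W = rng.random((m, n))         # weight matrix, one row per neuron
b = rng.random(m)              # one bias per neuron

z = W @ x + b                  # all m weighted sums at once
a = np.tanh(z)                 # activation applied element-wise
print(a.shape)                 # (3,): one output per neuron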
3. Forward Pass
Forward pass is just feeding information through the network:
- Inputs come into the first layer.
- Each neuron calculates its output using its weights, bias, and activation.
- The outputs become the inputs for the next layer.
- Finally, the output layer produces the network’s prediction.
Analogy:
- At the dinner, each family member gives their opinion (neuron output).
- Opinions are combined and passed on to the next group if needed (hidden layers).
- Finally, the family reaches a consensus (output layer).
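Chaining two such layers gives a complete forward pass. This sketch (arbitrary sizes, random weights) mirrors the dinner flow, with opinions passed from one group to the next:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.random(4)                           # input layer: 4 raw facts about the dinner

W1, b1 = rng.random((5, 4)), rng.random(5)  # hidden layer: 5 neurons
W2, b2 = rng.random((1, 5)), rng.random(1)  # output layer: 1 neuron

h = sigmoid(W1 @ x + b1)  # hidden layer opinions
y = sigmoid(W2 @ h + b2)  # final consensus
print(y)                  # the network's prediction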

4. Loss Function
To learn, the network needs to know how far off its predictions are. That’s what the loss function does.
- If the network says “Dinner is great” but the family actually thinks it’s just okay, the loss is high.
- The network will adjust to reduce this error.
Read about Loss Functions here.
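As a concrete sketch, here is one common loss, mean squared error, computed by hand on made-up numbers:

import numpy as np

y_true = np.array([1.0])  # the family's actual verdict: dinner was great
y_pred = np.array([0.6])  # the network's guess

loss = np.mean((y_true - y_pred) ** 2)  # mean squared error
print(loss)                             # about 0.16; the further off, the higher the loss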
5. Backpropagation
Backpropagation is how the network learns from mistakes.
- The network calculates how much each neuron contributed to the error (loss).
- Then it adjusts weights and biases slightly to reduce the loss next time.
Math Explanation (Simplified)
Each weight is nudged in the direction that reduces the loss:

$$w \leftarrow w - \eta \frac{\partial L}{\partial w}$$

Here $\eta$ is the learning rate (how big each adjustment is) and $\partial L / \partial w$ measures how much the loss changes when that weight changes.
Analogy:
- If Mom’s opinion was too harsh or too lenient in the dinner rating, we slightly adjust how much we consider her opinion next time.
- Over many meals (iterations), the network learns the perfect combination of opinions to make the right judgment.
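Here is one learning step for a single weight, using the update rule above. The numbers are invented for illustration:

w = 0.9     # current weight (how much we trust Mom's opinion)
grad = 0.4  # dL/dw: how much the loss grows if w grows
lr = 0.1    # learning rate: how cautiously we adjust

w = w - lr * grad  # gradient-descent update
print(w)           # about 0.86: the opinion now counts slightly less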

6. Activation Functions
Activation functions decide how strong a neuron’s output is.
- Sigmoid: squashes output between 0 and 1 (like yes/no opinions).
- ReLU: outputs zero if negative, or the input itself if positive (ignores weak opinions, emphasizes strong ones).
- Tanh: squashes between -1 and 1 (like positive/negative feelings).
Read about Activation Functions here.
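If you want to see what these three do to the same numbers, here is a quick NumPy comparison:

import numpy as np

z = np.array([-2.0, 0.0, 3.0])

sigmoid = 1 / (1 + np.exp(-z))  # values in (0, 1): soft yes/no
relu = np.maximum(0, z)         # 0 for negatives, the input itself for positives
tanh = np.tanh(z)               # values in (-1, 1): negative/positive feeling

print(sigmoid)
print(relu)
print(tanh)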
A House Price Example
Imagine you are trying to predict the price of a house. Looking at just one detail, like the number of bedrooms, won’t tell you the full story. Instead, you have to combine different details to form bigger ideas that are easier to reason about.
For example, the age of the house and how many renovations it has gone through together give you a sense of how “new” the property feels. The locality, the number of schools nearby, and how dense the housing is combine into an idea of the “quality of the neighborhood.” Similarly, the area of the house, the number of bedrooms, and the number of bathrooms together represent its “size.”
[Diagram: raw inputs (age, renovations, locality, schools nearby, housing density, area, bedrooms, bathrooms) combine into intermediate ideas (“newness,” “neighborhood quality,” “size”) that feed the final price.]
Now these intermediate ideas like newness, neighborhood quality, and size become inputs for the final decision: the price of the house. That’s exactly what the hidden layer in a neural network does. It takes raw inputs and combines them into more meaningful features, which are then passed on to the next layer until the network produces an output.
Of course, in a real neural network you don’t decide these groupings yourself. You don’t tell the model to treat bedrooms, bathrooms, and area as “size.” The network figures that out on its own while training, by adjusting weights and biases. The diagram here is just a way to make it easier to imagine.
Another important detail is that in an actual neural network, almost every neuron in one layer is connected to every neuron in the next layer. This means the model doesn’t just look at inputs in neat little groups. Instead, it tries many different combinations and gradually learns which patterns matter most. Over time, it automatically discovers the right intermediate features that lead to accurate predictions.
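If you wanted to sketch this in Keras, it might look like the snippet below. The eight input features and the layer sizes are made up for illustration; the point is that the hidden layers are free to discover groupings like “size” on their own:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# 8 hypothetical raw inputs: age, renovations, locality score, schools nearby,
# housing density, area, bedrooms, bathrooms
model = Sequential([
    Dense(16, activation='relu', input_shape=(8,)),  # hidden layer learns intermediate features
    Dense(8, activation='relu'),                     # combines them further
    Dense(1)                                         # output: the predicted price
])
model.compile(optimizer='adam', loss='mse')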
Building a Simple Neural Network to Classify Handwritten Digits
We will use the MNIST dataset, which has 28×28 grayscale images of digits from 0 to 9. The goal is to train a neural network that can correctly predict which digit is in an image.
Step 1: Import Libraries
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical
- tensorflow is the library that will help us build and train neural networks.
- Sequential allows us to build a network layer by layer.
- Dense is a fully connected layer (every neuron connected to the previous layer).
- Flatten converts the 2D image into a 1D vector for input.
- to_categorical converts labels into one-hot vectors (like a “vote” for each digit).
Step 2: Load and Preprocess the Data
# Load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Normalize pixel values to 0-1
x_train = x_train / 255.0
x_test = x_test / 255.0
# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
- MNIST images have pixel values from 0 to 255. Dividing by 255 scales them to 0–1, which helps the network learn faster.
- Labels like 0, 1, 2, …, 9 are converted to vectors of length 10. For example, label 3 becomes [0,0,0,1,0,0,0,0,0,0]. This helps the network “vote” for each digit.
- Normalizing is like adjusting each family member’s opinion to the same scale so no one opinion is too loud or too quiet. One-hot encoding is like giving each dish its own checkbox to mark whether it was tasty or not.
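You can sanity-check the preprocessing yourself. For example (the exact digit you see depends on the dataset order):

print(y_train[0])                          # a one-hot vector of length 10
print(x_train[0].min(), x_train[0].max())  # pixel values now between 0.0 and 1.0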
Step 3: Build the Neural Network
model = Sequential([
    Flatten(input_shape=(28, 28)),   # Input layer: flatten 28x28 image
    Dense(128, activation='relu'),   # Hidden layer with 128 neurons
    Dense(10, activation='softmax')  # Output layer: 10 neurons for 10 digits
])
- Flatten converts the 2D image into a 1D array of 784 pixels.
- The hidden layer has 128 neurons, each learning to detect patterns like lines, curves, or loops in the digits.
- The output layer has 10 neurons corresponding to digits 0–9. Softmax activation ensures all outputs sum to 1 (like probabilities).
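It is worth calling model.summary() here to inspect the shapes: the hidden layer has 784×128 + 128 = 100,480 parameters and the output layer 128×10 + 10 = 1,290, for 101,770 in total.

model.summary()  # prints each layer's output shape and parameter count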
Step 4: Compile the Model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
- Optimizer (adam): helps adjust weights to reduce error efficiently.
- Loss function (categorical_crossentropy): measures how wrong the network’s predictions are.
- Metrics (accuracy): lets us see how many digits the model gets right.
Step 5: Train the Model
model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.1)
- epochs=5: The network will look at the entire training dataset 5 times.
- batch_size=32: Weights are updated after seeing 32 images at a time.
- validation_split=0.1: 10% of the training data is used to check progress without training on it.
- Each epoch is like cooking dinner multiple times. The family learns from previous meals and gives better feedback each time.
- Batch size is like asking for feedback from a few members at a time instead of the whole family at once.
Step 6: Evaluate the Model
model.evaluate(x_test, y_test)
- This checks how well the network performs on unseen data (test set).
- Test accuracy tells us what fraction of digits the network classified correctly.
Step 7: Make Predictions
import numpy as np
# Predict first 5 test images
predictions = model.predict(x_test[:5])
for i, pred in enumerate(predictions):
    print(f"Image {i} prediction: {np.argmax(pred)}")
- model.predict outputs probabilities for each digit.
- np.argmax(pred) picks the digit with the highest probability.
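To see a prediction next to the actual image, you can plot one test digit (this assumes matplotlib is installed; y_test is one-hot, so argmax recovers the true label):

import matplotlib.pyplot as plt

i = 0
plt.imshow(x_test[i], cmap='gray')
plt.title(f"Predicted: {np.argmax(predictions[i])}, Actual: {np.argmax(y_test[i])}")
plt.show()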
At this point, we have a fully working neural network that classifies handwritten digits.
Summary
Neural networks are systems of interconnected neurons that learn patterns from data. Each neuron applies weights, a bias, and an activation function to its inputs, passing information through layers to produce predictions.
They are particularly useful for problems where traditional programming fails, such as image recognition, speech understanding, and predictions. Networks learn by comparing predictions to true values using a loss function and adjusting weights via backpropagation.
In our example, we built a simple network using TensorFlow to classify handwritten digits. The network learned to extract meaningful features from raw images, combine them in hidden layers, and output accurate predictions.
Even in this simple form, neural networks demonstrate the power of learning from data, much like combining small opinions to reach a final decision, but now in a structured, mathematical way.
To visualize neural networks: [Video]
To learn how to implement neural networks: [Playlist]
To read about Gradient Descent: [Article]
“It’s not enough to learn how to ride, you must also learn how to fall.” — Mexican proverb
Published via Towards AI