
Understanding Neural Networks — and Building One!

Last Updated on September 25, 2025 by Editorial Team

Author(s): Aditya Gupta

Originally published on Towards AI.

Why Do We Need Neural Networks?

Imagine trying to teach a computer to do something humans find easy, like recognizing a face in a photo, understanding someone’s accent, or predicting which movie you’ll enjoy next. Traditional programming struggles here because these tasks don’t follow strict rules. You can’t write an “if-else” for every possible scenario because there are too many variations.

This is where neural networks come in. They’re designed to learn from examples instead of being told exactly what to do. Feed them enough data, and they can discover patterns that are too complex for humans to explicitly code.

Some things neural networks are especially good at:

  • Image recognition: Spotting faces, cats, or handwritten digits.
  • Speech and language: Understanding voice commands, translating languages, or generating text.
  • Predictions: Forecasting stock prices, weather, or even your next favorite song.
  • Medical diagnostics: Detecting diseases from scans or medical data.

In short: neural networks are needed for any problem where patterns are complicated, messy, or too numerous to define with rules. They’re the “learning brains” for computers, letting machines tackle tasks that were once thought impossible.

What Are Neural Networks?

Think about a normal family dinner at home. The food is served, and everyone at the table has their own opinion.

  • Your mom might notice if the vegetables are cooked fine.
  • Your dad cares whether the food feels filling.
  • Your sibling only judges based on how good the dessert is.
  • Your grandmother might focus on whether the food is light and healthy.

Each person looks at the same dinner but from a different angle. And when you combine all these opinions, you arrive at the final judgment: “Dinner was great.”

That’s how a neural network works.

  • Each “family member” is like a neuron, focusing on one part of the input.
  • Some opinions count more than others, just like weights in a network.
  • Put together, they form the final output.

In short: a neural network is nothing more than lots of small judgments coming together to make one big decision.

How Neural Networks Work

Think of a neural network as a big family dinner decision-making process. The goal is to decide if the meal is great or not, but instead of one person deciding, several family members give their opinions. Each opinion counts differently depending on who it is. This is very similar to how neural networks process information.

[Figure: Visual representation of a neural network]

1. The Neuron

At the core of a neural network is a neuron. A neuron takes some inputs, applies weights to them, adds a bias, and then passes the result through an activation function to produce an output.

  • In the family dinner analogy, a neuron is like one family member. They focus on a specific aspect of the dinner, for example, whether the dal has enough spice.
  • The weight is how important their opinion is. Maybe Mom’s opinion matters a lot, and your sibling’s opinion matters less.
  • The bias is a baseline tendency. Maybe Grandma always likes the food a bit more, so her input is slightly adjusted up.

Math Explanation

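A single neuron with inputs x_1, …, x_n, weights w_1, …, w_n, and bias b first computes a weighted sum:

$$z = \sum_{i=1}^{n} w_i x_i + b$$

Then the neuron applies an activation function f to get the output:

$$a = f(z)$$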

  • For example, a sigmoid activation squashes the output between 0 and 1, like a yes/no opinion.
Source: https://medium.com/@ashwin3005/neural-networks-demystified-understanding-how-they-work-9206073071f8
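
The same computation can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the input scores, weights, and bias are made-up values chosen for the dinner analogy, and sigmoid is used as the activation.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical dinner scores (spice, saltiness, portion size), each between 0 and 1.
inputs = [0.9, 0.4, 0.7]
# Hypothetical weights: this family member cares mostly about spice.
weights = [2.0, -1.0, 0.5]
bias = -0.5

output = neuron(inputs, weights, bias)
print(f"Neuron output: {output:.3f}")
```

Raising the bias or the weight on spice pushes the output closer to 1, which matches the idea of a family member who is easier to please.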

2. Layers

Neurons are arranged in layers.

  • Input layer: The raw information about the dinner, for example saltiness, aroma, portion size, and sweetness. Each input feature gets its own node in this layer.
  • Hidden layer(s): Neurons that combine the inputs and process them further. These are like family members discussing among themselves before giving a final opinion.
  • Output layer: The final decision. For example, “Dinner was great” or “Dinner was okay.”

Math Explanation for a Layer

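Suppose you have a layer of m neurons and n inputs. Stacking the weights of all neurons into an m × n matrix W and their biases into a vector b, the whole layer computes:

$$\mathbf{a} = f(W\mathbf{x} + \mathbf{b})$$

where x is the vector of n inputs, a is the vector of m outputs, and f is applied element-wise.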
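
In code, a layer of m neurons acting on n inputs can be sketched as a single matrix-vector product. The sizes, the random weights, and the choice of ReLU as the activation are assumptions made for illustration only.

```python
import numpy as np

def dense_layer(x, W, b):
    """Compute ReLU(W @ x + b) for one layer with m neurons and n inputs."""
    z = W @ x + b            # shape (m,): one weighted sum per neuron
    return np.maximum(z, 0)  # ReLU activation, applied element-wise

n, m = 4, 3                       # 4 inputs, 3 neurons (illustrative sizes)
rng = np.random.default_rng(0)    # fixed seed so the run is repeatable
x = rng.random(n)                 # input vector, shape (n,)
W = rng.normal(size=(m, n))       # weight matrix, one row per neuron
b = np.zeros(m)                   # one bias per neuron

a = dense_layer(x, W, b)
print(a.shape)  # (3,)
```

Each row of W holds the weights of one neuron, so the layer produces one output per row.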

3. Forward Pass

Forward pass is just feeding information through the network:

  1. Inputs come into the first layer.
  2. Each neuron calculates its output using its weights, bias, and activation.
  3. The outputs become the inputs for the next layer.
  4. Finally, the output layer produces the network’s prediction.

Analogy:

  • At the dinner, each family member gives their opinion (neuron output).
  • Opinions are combined and passed on to the next group if needed (hidden layers).
  • Finally, the family reaches a consensus (output layer).
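
The four steps above can be sketched with NumPy. This is a toy network with random, untrained weights, so the prediction itself is meaningless. The layer sizes and input values are hypothetical, and the point is only to show outputs of one layer becoming inputs of the next.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Feed x through each (W, b) pair in turn; each output is the next input."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(42)
layers = [
    (rng.normal(size=(3, 4)), np.zeros(3)),  # hidden layer: 4 inputs -> 3 neurons
    (rng.normal(size=(1, 3)), np.zeros(1)),  # output layer: 3 inputs -> 1 neuron
]
x = np.array([0.8, 0.3, 0.6, 0.9])  # hypothetical saltiness, aroma, portion, sweetness

prediction = forward(x, layers)
print(prediction)  # a single value between 0 and 1
```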

4. Loss Function

To learn, the network needs to know how far off its predictions are. That’s what the loss function does.

  • If the network says “Dinner is great” but the family actually thinks it’s just okay, the loss is high.
  • The network will adjust to reduce this error.
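
As a rough sketch, here is one common loss, mean squared error, applied to the dinner example. Mapping "great" to 1.0 and "okay" to 0.5 is an assumption made purely to get numbers to work with.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical ratings on a 0-1 scale: 1.0 = "great", 0.5 = "okay".
actual = [0.5]
confident_wrong = [1.0]   # network says "great", family says "okay"
nearly_right = [0.55]

print(mean_squared_error(actual, confident_wrong))  # 0.25
print(mean_squared_error(actual, nearly_right))
```

The further the prediction is from the family's real verdict, the larger the loss.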

Read about Loss Functions here.

5. Backpropagation

Backpropagation is how the network learns from mistakes.

  • The network calculates how much each neuron contributed to the error (loss).
  • Then it adjusts weights and biases slightly to reduce the loss next time.

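Math Explanation (Simplified)

For a loss L and a learning rate η, each weight and bias takes a small step against its gradient:

$$w \leftarrow w - \eta \frac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}$$

The gradients come from the chain rule, applied layer by layer from the output back to the input.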

Analogy:

  • If Mom’s opinion was too harsh or too lenient in the dinner rating, we slightly adjust how much we consider her opinion next time.
  • Over many meals (iterations), the network learns a combination of opinions that leads to accurate judgments.
Source: https://medium.com/analytics-vidhya/backpropagation-for-dummies-e069410fa585
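
To make the idea concrete, here is a minimal sketch of repeated updates for a single sigmoid neuron trained on one example. The squared-error loss, the learning rate of 0.5, and the input values are all assumptions for illustration. A real network applies the same chain rule across many neurons and layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(w, b, x, y, lr):
    """One gradient-descent update for a sigmoid neuron with squared-error loss."""
    a = sigmoid(w @ x + b)            # forward pass
    loss = (a - y) ** 2
    # Chain rule: dL/dz = dL/da * da/dz
    dz = 2 * (a - y) * a * (1 - a)
    w = w - lr * dz * x               # dL/dw = dz * x
    b = b - lr * dz                   # dL/db = dz
    return w, b, loss

x = np.array([0.9, 0.4, 0.7])   # hypothetical dinner features
y = 1.0                         # target: the dinner really was great
w = np.zeros(3)
b = 0.0

losses = []
for _ in range(100):
    w, b, loss = train_step(w, b, x, y, lr=0.5)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The loss shrinks over the iterations because each step nudges the weights and bias in the direction that reduces the error.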

6. Activation Functions

Activation functions decide how strong a neuron’s output is.

  • Sigmoid: squashes output between 0 and 1 (like yes/no opinions).
  • ReLU: outputs zero if negative, or the input itself if positive (ignores weak opinions, emphasizes strong ones).
  • Tanh: squashes between -1 and 1 (like positive/negative feelings).
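
The three functions are short enough to write out directly. A quick sketch with NumPy, using a few sample values picked for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

z = np.array([-2.0, 0.0, 2.0])

print("sigmoid:", sigmoid(z))
print("relu:   ", relu(z))
print("tanh:   ", np.tanh(z))
```

Sigmoid maps 0 to 0.5, ReLU zeroes out the negative value, and tanh keeps the sign of its input while staying inside (-1, 1).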

Read about Activation Functions here.

A House Price Example

Imagine you are trying to predict the price of a house. Looking at just one detail, like the number of bedrooms, won’t tell you the full story. Instead, you have to combine different details to form bigger ideas that are easier to reason about.

For example, the age of the house and how many renovations it has gone through together give you a sense of how “new” the property feels. The locality, the number of schools nearby, and how dense the housing is combine into an idea of the “quality of the neighborhood.” Similarly, the area of the house, the number of bedrooms, and the number of bathrooms together represent its “size.”

Now these intermediate ideas like newness, neighborhood quality, and size become inputs for the final decision: the price of the house. That’s exactly what the hidden layer in a neural network does. It takes raw inputs and combines them into more meaningful features, which are then passed on to the next layer until the network produces an output.

Of course, in a real neural network you don’t decide these groupings yourself. You don’t tell the model to treat bedrooms, bathrooms, and area as “size.” The network figures that out on its own while training, by adjusting weights and biases. The diagram here is just a way to make it easier to imagine.

Another important detail is that in a fully connected network, like the one built below, every neuron in one layer is connected to every neuron in the next layer. This means the model doesn’t just look at inputs in neat little groups. Instead, it tries many different combinations and gradually learns which patterns matter most. Over time, it automatically discovers intermediate features that lead to accurate predictions.
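
To get a feel for what "fully connected" means here, the sketch below counts the learnable numbers in a toy version of the house price network. The feature names are labels borrowed from the example for readability. A trained network would not produce such neatly named intermediate features.

```python
# Layer sizes taken from the example: 8 raw inputs, 3 intermediate ideas, 1 price.
inputs = ["age", "renovations", "locality", "schools_nearby", "housing_density",
          "area", "bedrooms", "bathrooms"]
hidden = ["newness", "neighborhood_quality", "size"]

def dense_params(n_in, n_out):
    """Weights (one per input-output pair) plus one bias per output neuron."""
    return n_in * n_out + n_out

hidden_params = dense_params(len(inputs), len(hidden))  # 8*3 + 3 = 27
output_params = dense_params(len(hidden), 1)            # 3*1 + 1 = 4
print(hidden_params + output_params)  # 31
```

Every one of the 8 inputs feeds every one of the 3 hidden neurons, which is why the hidden layer alone has 24 weights rather than 8.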

Building a Simple Neural Network to Classify Handwritten Digits

We will use the MNIST dataset, which has 28×28 grayscale images of digits from 0 to 9. The goal is to train a neural network that can correctly predict which digit is in an image.

Step 1: Import Libraries

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical
  • tensorflow is the library that will help us build and train neural networks.
  • Sequential allows us to build a network layer by layer.
  • Dense is a fully connected layer (all neurons connected to previous layer).
  • Flatten converts the 2D image into a 1D vector for input.
  • to_categorical converts labels into one-hot vectors (like a “vote” for each digit).

Step 2: Load and Preprocess the Data

# Load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to 0-1
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
  • MNIST images have pixel values from 0 to 255. Dividing by 255 scales them to 0–1, which helps the network learn faster.
  • Labels like 0,1,2…9 are converted to vectors of length 10. For example, label 3 becomes [0,0,0,1,0,0,0,0,0,0]. This helps the network “vote” for each digit.
  • Normalizing is like adjusting each family member’s opinion on the same scale so no one opinion is too loud or too quiet. One-hot encoding is like giving each dish its own checkbox to mark whether it was tasty or not.
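
Both preprocessing steps can be tried on a tiny stand-in for MNIST without downloading anything. The 2×2 "images" below are made up, and `np.eye` is used here as one simple way to build one-hot vectors in place of `to_categorical`.

```python
import numpy as np

# A tiny stand-in for MNIST: two made-up 2x2 "images" with pixel values 0-255.
images = np.array([[[0, 255], [128, 64]],
                   [[255, 255], [0, 0]]], dtype=np.uint8)
labels = np.array([3, 0])

scaled = images / 255.0            # values now lie in [0, 1]
one_hot = np.eye(10)[labels]       # row i of the identity matrix encodes label i

print(scaled.min(), scaled.max())  # 0.0 1.0
print(one_hot[0])                  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```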

Step 3: Build the Neural Network

model = Sequential([
    Flatten(input_shape=(28, 28)),    # Input layer: flatten 28x28 image
    Dense(128, activation='relu'),    # Hidden layer with 128 neurons
    Dense(10, activation='softmax')   # Output layer: 10 neurons for 10 digits
])
  • Flatten converts the 2D image into a 1D array of 784 pixels.
  • The hidden layer has 128 neurons, each learning to detect patterns like lines, curves, or loops in the digits.
  • The output layer has 10 neurons corresponding to digits 0–9. Softmax activation ensures all outputs sum to 1 (like probabilities).
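
Two details of this architecture are easy to check by hand: that softmax outputs sum to 1, and how many trainable parameters the model has. The sketch below uses a hand-written softmax and made-up scores, so it is an illustration of the idea rather than the Keras implementation.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = z - np.max(z)        # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([2.0, 1.0, 0.1])   # made-up raw outputs for three classes
probs = softmax(scores)
print(probs, probs.sum())

# Trainable parameters in the model above: weights plus biases per Dense layer.
hidden_params = 784 * 128 + 128       # 100480
output_params = 128 * 10 + 10         # 1290
print(hidden_params + output_params)  # 101770
```

Flatten has no parameters of its own, since it only reshapes the image.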

Step 4: Compile the Model

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
  • Optimizer (adam): Helps adjust weights to reduce error efficiently.
  • Loss function (categorical_crossentropy): Measures how wrong the network’s predictions are.
  • Metrics (accuracy): Lets us see how many digits the model gets right.
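
To see what the loss measures, here is a hand-written sketch of categorical cross-entropy for a single image. The probability vectors are invented, and the small `eps` clip is an assumption added to avoid taking the log of zero.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)        # avoid log(0)
    return float(-np.sum(y_true * np.log(y_pred)))

y_true = np.eye(10)[3]            # the true digit is 3

confident = np.full(10, 0.01)     # 1% on each wrong digit...
confident[3] = 0.91               # ...and 91% on the right one
unsure = np.full(10, 0.1)         # 10% on every digit

print(categorical_crossentropy(y_true, confident))   # about 0.094
print(categorical_crossentropy(y_true, unsure))      # about 2.303
```

Only the probability assigned to the correct digit matters, so a confident correct prediction gives a loss near zero.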

Step 5: Train the Model

model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.1)
  • epochs=5: The network will look at the entire training dataset 5 times.
  • batch_size=32: Updates weights after seeing 32 images at a time.
  • validation_split=0.1: 10% of training data is used to check progress without training on it.
  • Epochs are like cooking the same dinner several times. The family learns from previous meals and gives better feedback each time.
  • Batch size is like asking feedback from a few members at a time instead of the whole family at once.
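
These settings determine how many weight updates happen during training. A back-of-the-envelope sketch, assuming the standard MNIST training set of 60,000 images:

```python
import math

total_samples = 60_000        # size of the MNIST training set
validation_split = 0.1
batch_size = 32
epochs = 5

val_samples = int(round(total_samples * validation_split))  # 6000 held out
train_samples = total_samples - val_samples                 # 54000 used for training
steps_per_epoch = math.ceil(train_samples / batch_size)

print(steps_per_epoch)            # 1688 weight updates per epoch
print(steps_per_epoch * epochs)   # 8440 weight updates in total
```

The last batch of each epoch is smaller than 32, which is why the step count is rounded up.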

Step 6: Evaluate the Model

model.evaluate(x_test, y_test)
  • This checks how well the network performs on unseen data (test set).
  • Test accuracy tells us what fraction of digits the network classified correctly.

Step 7: Make Predictions

import numpy as np

# Predict first 5 test images
predictions = model.predict(x_test[:5])

for i, pred in enumerate(predictions):
    print(f"Image {i} prediction: {np.argmax(pred)}")
  • model.predict outputs probabilities for each digit.
  • np.argmax(pred) picks the digit with the highest probability.
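
The step from probabilities to predicted labels, and from labels to accuracy, can be sketched without a trained model. The probability rows and true labels below are made up, and only four classes are used to keep the example short.

```python
import numpy as np

# Made-up softmax outputs for three images (each row sums to 1).
probs = np.array([
    [0.05, 0.05, 0.80, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.10, 0.30, 0.20, 0.40],
])
true_labels = np.array([2, 0, 1])

predicted = np.argmax(probs, axis=1)          # most likely class per row
accuracy = np.mean(predicted == true_labels)  # fraction classified correctly

print(predicted)   # [2 0 3]
print(accuracy)
```

Two of the three predictions match the true labels, so the accuracy is two thirds.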

At this point, we have a fully working neural network that classifies handwritten digits.

Summary

Neural networks are systems of interconnected neurons that learn patterns from data. Each neuron applies weights, a bias, and an activation function to its inputs, passing information through layers to produce predictions.

They are particularly useful for problems where traditional programming fails, such as image recognition, speech understanding, and predictions. Networks learn by comparing predictions to true values using a loss function and adjusting weights via backpropagation.

In our example, we built a simple network using TensorFlow to classify handwritten digits. The network learned to extract meaningful features from raw images, combine them in hidden layers, and output accurate predictions.

Even in this simple form, neural networks demonstrate the power of learning from data, much like combining small opinions to reach a final decision, but now in a structured, mathematical way.

To visualize neural networks: [Video]

To learn how to implement neural networks: [Playlist]

To read about Gradient Descent: [Article]

“It’s not enough to learn how to ride, you must also learn how to fall.” — Mexican proverb

Published via Towards AI

Note: Content contains the views of the contributing authors and not Towards AI.