
GANs: How AI Creates Images from Noise

Last Updated on October 7, 2025 by Editorial Team

Author(s): Aditya Gupta

Originally published on Towards AI.

We all love pranks, right? Think of a friend you like to prank. The first time, the prank works: you get a few laughs and maybe they get annoyed. The next time, they won't fall for the same trick because they learned from last time, so you have to come up with a better prank.

Maybe the better prank fools them again, but now they are even more cautious and sharper at spotting your tricks. So you escalate with an even better prank, and they get even better at catching it. This keeps going, and in the end both of you improve: you get better at pranking, and your friend gets better at identifying your pranks.

It is a bit like Jim and Dwight from The Office, where Jim constantly pranks Dwight and both of them get better each time.


This is exactly how a GAN works. One party, the generator, produces an output. The other party, the discriminator, tries to figure out whether that output was made by the generator (and is therefore fake) or whether it is real. They keep learning from each other and improving every round. By the end, the generator can produce outputs that look very real, and the discriminator becomes very good at spotting fakes.

What Exactly is a GAN?

So now that we’ve seen the prank analogy, let’s talk about what a GAN really is. GAN stands for Generative Adversarial Network. It is basically a system made of two neural networks that compete with each other, kind of like a friendly rivalry.

One network is called the generator. Its job is to produce an output. The generator tries to make the output as realistic as possible so the other network cannot tell if it is fake.

The second network is called the discriminator. Its job is to figure out whether an output is real or if it was made by the generator. You can think of it like a friend trying to spot your pranks. The discriminator gets better every time it catches the generator.

The key idea is that these two networks are constantly learning from each other. The generator tries to fool the discriminator and the discriminator tries not to get fooled. Over time, both improve in their roles and the generator starts producing outputs that look very real.

GANs are unique because instead of learning in the usual way where you tell a network the right answer, they learn by competing. This competition is what makes them so powerful and creative.

Source: AWS

The Generator and How It Works

The generator is one part of the GAN and its main job is to produce outputs. You can think of it as the prankster in our earlier analogy. It starts with nothing, just random noise, and tries to create outputs that look real enough to fool the discriminator.

Intuition

Imagine you are trying to draw a face without looking at a photo. You start with random scribbles. At first, it looks nothing like a face. But each time you get feedback from someone who tells you what looks right and what looks wrong, you adjust. Over time, your drawings improve.

The generator works the same way. It takes random input z from a simple distribution, usually Gaussian or uniform, and transforms it into a sample G(z). At first, these outputs are just noise. But as the discriminator gives feedback, the generator learns to produce outputs that look closer to real data.

Generator Loss

The generator’s goal is to fool the discriminator. Think of it like trying to pull off a prank. You want your friend to believe your prank is real. The generator wants the discriminator to think its output is real.

The loss function tells the generator how well it did. In the original minimax formulation, the generator's loss is:

L_G = E_{z ∼ p_z}[ log(1 − D(G(z))) ]

Here’s what this means step by step:

  1. Random noise input: z ∼ p_z
  • This is just a random vector, like starting with a blank canvas or scribbles
  2. Generate an output: G(z)
  • The generator turns the noise into something that looks like real data
  3. Discriminator evaluation: D(G(z))
  • The discriminator looks at this output and gives a probability of it being real
  4. Compute the loss:
  • If the discriminator is fooled (thinks it’s real), D(G(z)) is high and the loss is low
  • If the discriminator is not fooled (thinks it’s fake), D(G(z)) is low and the loss is high

So the generator wants to minimize the loss, which is the same as maximizing the chance that the discriminator thinks its output is real.

Imagine you prank your friend and they immediately see it’s fake; that’s a high loss. Next time, you make a better prank and they get fooled; that’s a low loss. Each prank is like one iteration of learning.
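To make this concrete, here is a minimal, hypothetical sketch in PyTorch of how the generator loss behaves for a batch of discriminator scores. The numbers are made up for illustration, and the non-saturating variant shown alongside the minimax form is the one the MNIST training code later in this article effectively uses through binary cross-entropy.

import torch

# Hypothetical D(G(z)) scores for three generated samples (made-up numbers)
d_on_fake = torch.tensor([0.9, 0.1, 0.6])

# Minimax form from the formula above: low when D(G(z)) is high (discriminator fooled)
g_loss_minimax = torch.log(1 - d_on_fake).mean()

# Non-saturating form often used in practice: minimize -log D(G(z)) instead
g_loss_nonsat = -torch.log(d_on_fake).mean()

print(g_loss_minimax.item(), g_loss_nonsat.item())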

Step-by-Step How It Learns

  1. Start with random noise: The generator receives a random vector z
  2. Produce an output: It transforms z into a sample G(z)
  3. Discriminator feedback: The discriminator evaluates the output and returns a probability of being real
  4. Compute loss: Using the formula above, the generator calculates how well it fooled the discriminator
  5. Backpropagate: The generator updates its weights to reduce the loss and improve future outputs
  6. Repeat: Each iteration helps the generator produce outputs closer to real data

Neural Network Details

The generator is usually a neural network with layers that gradually upscale the noise into a structured output. For example:

  • Start with a 100-dimensional noise vector
  • Pass it through fully connected layers
  • Reshape and pass it through deconvolutional (transpose convolution) layers
  • Produce a final output, like a 64×64 or 128×128 image

The training happens using gradient descent, just like any other neural network. The gradient is computed from the generator loss, which depends on the discriminator’s evaluation. This is why the discriminator’s feedback is so crucial.
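As a rough sketch of that upsampling idea (this is not the exact architecture used in the MNIST example later in the article, and the layer sizes here are assumptions chosen for a small 32×32 RGB output), a DCGAN-style generator built from transpose convolutions might look like this:

import torch
import torch.nn as nn

# Sketch of a DCGAN-style generator: layer sizes are illustrative assumptions
class ConvGenerator(nn.Module):
    def __init__(self, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            # Project the noise vector to a 4x4 feature map, then upsample step by step
            nn.ConvTranspose2d(latent_dim, 256, kernel_size=4, stride=1, padding=0),  # 1x1 -> 4x4
            nn.BatchNorm2d(256),
            nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 4x4 -> 8x8
            nn.BatchNorm2d(128),
            nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # 8x8 -> 16x16
            nn.BatchNorm2d(64),
            nn.ReLU(True),
            nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),     # 16x16 -> 32x32
            nn.Tanh(),  # outputs in [-1, 1]
        )

    def forward(self, z):
        # z: (batch, latent_dim) -> reshape to (batch, latent_dim, 1, 1) for the conv layers
        return self.net(z.view(z.size(0), -1, 1, 1))

# Usage: a batch of 8 noise vectors becomes 8 RGB images of size 32x32
imgs = ConvGenerator()(torch.randn(8, 100))
print(imgs.shape)  # torch.Size([8, 3, 32, 32])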

The Discriminator and How It Works

The discriminator is the second part of a GAN. Its main job is to figure out whether an output is real or produced by the generator. You can think of it as your friend who tries to spot your pranks. Each time the generator produces an output, the discriminator tries to detect if it’s fake.

Intuition

Imagine you are the friend being pranked. The first time, you might fall for it because you don’t know what to expect. But each time you catch a prank, you get better at spotting the tricks. That is exactly how the discriminator works.

It looks at the generator’s outputs and compares them to real data. Over time, it gets better at spotting fakes and giving meaningful feedback to the generator.

Discriminator Loss

The discriminator’s goal is to correctly classify real vs fake samples. Its loss function (to be minimized) is:

L_D = − E_{x ∼ p_data}[ log D(x) ] − E_{z ∼ p_z}[ log(1 − D(G(z))) ]

Step-by-step explanation:

  • x ~ p_data → a real data sample
  • z ~ p_z → random noise input to the generator
  • G(z) → output generated from the noise
  • D(x) → probability the discriminator assigns to a real sample being real
  • D(G(z)) → probability the discriminator assigns to a generated output being real

The discriminator wants to maximize D(x) for real data and minimize D(G(z)) for fake outputs. The combined loss tells it how well it did on both real and fake examples.

Analogy: Every time your friend catches a prank, they learn. The loss is high when they get fooled and low when they correctly identify the prank.
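Here is a minimal sketch, with made-up scores, of how that combined loss can be computed using the binary cross-entropy form that the MNIST code later in the article also relies on:

import torch
import torch.nn.functional as F

# Hypothetical discriminator outputs for a small batch (probabilities of "real")
d_on_real = torch.tensor([0.8, 0.9, 0.7])   # D(x) for real samples
d_on_fake = torch.tensor([0.2, 0.4, 0.1])   # D(G(z)) for generated samples

# Discriminator loss: -log D(x) for real samples, -log(1 - D(G(z))) for fakes
real_loss = F.binary_cross_entropy(d_on_real, torch.ones_like(d_on_real))
fake_loss = F.binary_cross_entropy(d_on_fake, torch.zeros_like(d_on_fake))
d_loss = real_loss + fake_loss
print(d_loss.item())  # low when D is right about both the real and the fake samples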

Step-by-Step How It Learns

  1. Receive real samples: The discriminator sees real data x
  2. Receive fake samples: The discriminator also sees generated outputs G(z)
  3. Classify each sample: It outputs a probability of each sample being real
  4. Compute loss: Using the discriminator loss formula
  5. Backpropagate: Update its weights to improve classification
  6. Repeat: Each iteration makes the discriminator better at spotting fakes

Without the discriminator, the generator would have no feedback and could not improve.

Neural Network Details

The discriminator is also a neural network. Its job is basically binary classification: real or fake. Typical architecture:

  • Input: A real or generated sample
  • Several convolutional layers to extract features (for images)
  • Fully connected layers to combine features
  • Output layer: Single neuron with sigmoid activation that outputs a probability

The discriminator learns through gradient descent using its own loss function. The better it becomes at spotting fakes, the more challenging it is for the generator, which helps both networks improve.
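For images, a convolutional version of that classifier might look like the following sketch; the input size and layer widths are assumptions chosen to mirror the 32×32 generator sketch above, not taken from the article's MNIST code:

import torch
import torch.nn as nn

# Sketch of a convolutional discriminator for 32x32 RGB images; sizes are illustrative
class ConvDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),    # 32x32 -> 16x16
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # 16x16 -> 8x8
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1), # 8x8 -> 4x4
            nn.LeakyReLU(0.2, inplace=True),
            nn.Flatten(),
            nn.Linear(256 * 4 * 4, 1),
            nn.Sigmoid(),  # probability that the input image is real
        )

    def forward(self, img):
        return self.net(img)

# Usage: classify a batch of 8 images as real (close to 1) or fake (close to 0)
probs = ConvDiscriminator()(torch.randn(8, 3, 32, 32))
print(probs.shape)  # torch.Size([8, 1])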

The Adversarial Game: How GANs Train

Now that we’ve seen the generator and the discriminator individually, let’s look at how they work together. This is the “game” part of Generative Adversarial Networks.

Think of it like the prank analogy:

  • You (the generator) try to pull pranks
  • Your friend (the discriminator) tries to spot them
  • Every time you improve your prank, your friend gets better at detecting it
  • Every time your friend gets better, you have to come up with an even smarter prank

This back-and-forth continues until both of you get really good at your jobs.

GAN Training Objective

Mathematically, GANs are formulated as a minimax game over a value function V(D, G):

min_G max_D V(D, G) = E_{x ∼ p_data}[ log D(x) ] + E_{z ∼ p_z}[ log(1 − D(G(z))) ]

Step-by-step explanation:

  • D wants to maximize V(D, G): it wants to correctly classify real and fake samples
  • G wants to minimize V(D, G): it wants to fool D into thinking its outputs are real
  • Together, this forms a two-player game: the generator improves to fool the discriminator, and the discriminator improves to correctly identify fakes

Training Loop

Here’s how the adversarial training works in practice:

  1. Sample real data x ~ p_data
  2. Sample random noise z ~ p_z
  3. Generate fake outputs G(z)
  4. Update the discriminator D using its loss on both real and fake samples
  5. Update the generator G using its loss, based on the discriminator’s feedback
  6. Repeat until both networks improve and converge

This is like repeatedly pranking and detecting pranks. The generator gets better at pranking, the discriminator gets better at spotting fakes, and eventually, both reach a point where the generator’s outputs are almost indistinguishable from real data.

Convergence

In theory, the game reaches a point called Nash Equilibrium, where:

  • The generator produces outputs that are indistinguishable from real data
  • The discriminator predicts 50–50 for real vs fake, meaning it can no longer reliably tell them apart
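As a brief aside from standard GAN theory (consistent with, but not derived in, this article): for a fixed generator, the optimal discriminator can be written in closed form:

D*(x) = p_data(x) / (p_data(x) + p_g(x))

When the generator matches the data distribution exactly (p_g = p_data), this reduces to D*(x) = 1/2 for every input, which is exactly the 50–50 behaviour described above, and the value of the minimax game settles at V(D*, G) = −log 4.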

In practice, GANs are tricky to train and may oscillate or collapse, but with techniques like batch normalization, careful learning rates, and architectural tweaks, they can produce amazing results.

Source: https://newsletter.theaiedge.io/p/how-generative-adversarial-networks

Practical Examples and Applications of GANs

Now that we understand how GANs work, let’s see what they can actually do. GANs are not just theory; they are behind some of the coolest AI applications today.

1. Image Generation

GANs can create realistic images from random noise. You start with a vector of numbers that looks like nothing, and the generator transforms it into something that can fool the discriminator.

Think of it like taking a blank canvas and starting with random scribbles. Over time, the generator learns to turn those scribbles into something that looks like a real photo.

Example: Generating portraits of people who don’t exist. Websites like ThisPersonDoesNotExist.com use GANs to create realistic human faces that are completely fake.

2. Style Transfer

GANs can also change the style of images. For example, turning a regular photo into a painting in the style of Van Gogh or Picasso.

Imagine taking a selfie and turning it into a Starry Night painting. The generator learns the patterns of the target style and applies them to your photo.

3. Super-Resolution

GANs can enhance low-resolution images and make them high-resolution.

Think of an old blurry photo that you want to make sharp. The generator learns what high-resolution details should look like and produces a crisp image from the low-res input.

4. Deepfakes and Face Swaps

GANs are behind many deepfake videos. They can swap faces in videos realistically, making it look like someone else is doing or saying something they never actually did.

This is powerful, but also a reminder that GANs can be used for good and bad purposes. Always be careful and ethical with this technology.

5. Beyond Images

GANs are not limited to just pictures. They can be used to generate:

  • Music or audio samples
  • Synthetic medical scans for research
  • Molecules or chemical structures in drug discovery
  • Text-to-image generation for creative content

The core idea is always the same: the generator produces outputs, the discriminator judges them, and both improve through their back-and-forth game.

How to Code Your Own GAN to Generate MNIST Digits

In this example, the GAN will generate handwritten digit images similar to MNIST. The generator creates images from random noise, and the discriminator tries to distinguish real MNIST images from generated ones. Over time, both improve.

1. Import Libraries

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import torchvision
import matplotlib.pyplot as plt

We import PyTorch for building neural networks, torchvision for datasets and image utilities, and matplotlib to visualize generated images.

2. Set Device and Hyperparameters

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
latent_dim = 100 # Size of random noise vector
batch_size = 128
lr = 0.0002
epochs = 50
image_size = 28*28 # MNIST images are 28x28

We define the GPU/CPU device and hyperparameters for training, including the size of the noise vector, batch size, learning rate, number of epochs, and image size.

3. Load MNIST Dataset

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

We normalize images to [-1, 1] and load the MNIST dataset. The DataLoader allows batch processing.

4. Define the Generator

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 1024),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(1024, image_size),
            nn.Tanh()
        )

    def forward(self, z):
        img = self.model(z)
        return img

The generator takes random noise as input and produces a flattened image. Tanh activation ensures outputs are in [-1, 1]. It gradually expands the noise vector through layers to create structured images.

5. Define the Discriminator

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(image_size, 512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, img):
        validity = self.model(img)
        return validity

The discriminator is a binary classifier. It takes a flattened image and outputs a probability of being real (1) or fake (0).

6. Initialize Models and Optimizers

generator = Generator().to(device)
discriminator = Discriminator().to(device)
adversarial_loss = nn.BCELoss()
optimizer_G = optim.Adam(generator.parameters(), lr=lr, betas=(0.5, 0.999))
optimizer_D = optim.Adam(discriminator.parameters(), lr=lr, betas=(0.5, 0.999))

We create the generator and discriminator instances and move them to the GPU if available. We use binary cross-entropy loss and Adam optimizer for both networks.

7. Training the GAN

for epoch in range(epochs):
    for i, (imgs, _) in enumerate(train_loader):

        real_imgs = imgs.view(imgs.size(0), -1).to(device)
        real_labels = torch.ones(imgs.size(0), 1).to(device)
        fake_labels = torch.zeros(imgs.size(0), 1).to(device)

        # Train discriminator
        optimizer_D.zero_grad()
        z = torch.randn(imgs.size(0), latent_dim).to(device)
        fake_imgs = generator(z)
        real_loss = adversarial_loss(discriminator(real_imgs), real_labels)
        fake_loss = adversarial_loss(discriminator(fake_imgs.detach()), fake_labels)
        d_loss = real_loss + fake_loss
        d_loss.backward()
        optimizer_D.step()

        # Train generator
        optimizer_G.zero_grad()
        g_loss = adversarial_loss(discriminator(fake_imgs), real_labels)
        g_loss.backward()
        optimizer_G.step()

    print(f"Epoch [{epoch+1}/{epochs}] | D Loss: {d_loss.item():.4f} | G Loss: {g_loss.item():.4f}")

    if (epoch + 1) % 10 == 0:
        with torch.no_grad():
            sample_z = torch.randn(16, latent_dim).to(device)
            generated = generator(sample_z).view(-1, 1, 28, 28)
            grid = torchvision.utils.make_grid(generated, nrow=4, normalize=True)
            plt.imshow(grid.permute(1, 2, 0).cpu())
            plt.show()

  • The discriminator learns to distinguish real from fake images.
  • The generator tries to fool the discriminator.
  • Losses are printed for monitoring progress.
  • Every 10 epochs, we visualize 16 generated digits, which should gradually start looking like real MNIST digits.

What This GAN Produces

  • Initially, the outputs are random noise.
  • After training, the generator produces handwritten digit images that look realistic.
  • Each epoch, the images get clearer as the generator learns from the discriminator’s feedback.
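Once training has finished, the trained generator can be used on its own to produce new digits. A minimal sketch, reusing the generator, latent_dim, and device variables defined above (the output filename is just an example):

# Sample new digits from the trained generator; no gradients are needed at inference time
generator.eval()
with torch.no_grad():
    z = torch.randn(64, latent_dim).to(device)
    samples = generator(z).view(-1, 1, 28, 28)
    grid = torchvision.utils.make_grid(samples, nrow=8, normalize=True)
    torchvision.utils.save_image(grid, "generated_digits.png")  # example filename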

Challenges and Problems in GANs

GANs are super powerful, but they are also tricky to train. Unlike regular neural networks, GANs involve two networks learning at the same time, which can lead to a bunch of problems. Let’s go through the main ones.

1. Training Instability

GANs are like a continuous prank war. The generator tries to fool the discriminator, and the discriminator tries not to be fooled. If one of them gets too strong too quickly, training becomes unstable.

What can happen:

  • Losses may oscillate wildly.
  • The generator may produce outputs that don’t improve over time.
  • Sometimes the network never converges, and images remain noisy.

2. Mode Collapse

This is when the generator starts producing very similar outputs all the time. It has found a “shortcut” that fools the discriminator but fails to capture the diversity of real data.

Example:

  • On MNIST, the generator might produce only the digit “3” repeatedly.
  • On faces, it could generate very similar faces over and over.

3. Vanishing Gradients

If the discriminator becomes too good too fast, it can confidently reject all generated outputs. This gives the generator almost no feedback, so it cannot improve.

Imagine pranking someone who has learned to spot every trick instantly. You get no feedback on how to improve your prank, so your learning stalls.
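One simple and widely used mitigation is one-sided label smoothing: when training the discriminator, use a target of 0.9 instead of 1.0 for real samples so it never becomes absurdly confident. A hedged sketch of how that would change the discriminator step in the training loop above (it reuses imgs, device, real_imgs, discriminator, and adversarial_loss from that code):

# One-sided label smoothing: train D against 0.9 for real samples instead of 1.0
smooth_real_labels = torch.full((imgs.size(0), 1), 0.9, device=device)
real_loss = adversarial_loss(discriminator(real_imgs), smooth_real_labels)
# The fake-sample loss and the generator update stay exactly as before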

4. Sensitive Hyperparameters

GANs are very sensitive to choices like:

  • Learning rates for generator and discriminator
  • Batch size
  • Network architecture and number of layers

What can happen:

  • Too high learning rate → training diverges
  • Too low learning rate → generator improves too slowly
  • Imbalanced architectures → one network dominates the other

5. Evaluation Difficulty

It’s hard to measure how good a GAN is quantitatively. Unlike supervised learning, there’s no single “accuracy” score.

What can happen:

  • Visual inspection is often required.
  • Metrics like FID (Fréchet Inception Distance) or IS (Inception Score) exist, but they are not perfect.

6. Overfitting and Generalization

If the generator memorizes training data instead of learning the underlying patterns, it can overfit. This means:

  • Outputs may look real but are just copies of training images.
  • The GAN fails to generate truly new and diverse samples.

Types of GANs

Over the years, researchers have developed many variants of GANs to solve specific problems or improve training stability. Here are some of the most common types:

  1. DCGAN (Deep Convolutional GAN)
    Uses convolutional layers instead of fully connected layers. Great for generating images with better quality and stability.
  2. Conditional GAN (cGAN)
    Allows you to control the output by giving a label or condition. For example, generating a digit “7” instead of a random digit (a short code sketch follows this list).
  3. Wasserstein GAN (WGAN)
    Improves training stability by using a different loss function based on Wasserstein distance. Helps prevent mode collapse and vanishing gradients.
  4. CycleGAN
    Specialized for image-to-image translation without paired data. Example: turning horses into zebras or summer photos into winter scenes.
  5. StyleGAN
    Generates high-quality, realistic images, especially faces. Introduces style layers to control features like hair, pose, and background.
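To give a flavour of how the conditional variant (cGAN, item 2 above) differs from the plain MNIST GAN coded earlier, here is a hedged sketch: the only structural change is that the networks also receive the class label, here as a learned embedding concatenated to the generator's noise input (layer sizes are illustrative assumptions):

import torch
import torch.nn as nn

# Sketch of a conditional generator for MNIST-style digits: the label is embedded
# and concatenated to the noise vector, so you can ask for a specific digit
class ConditionalGenerator(nn.Module):
    def __init__(self, latent_dim=100, num_classes=10, image_size=28 * 28):
        super().__init__()
        self.label_emb = nn.Embedding(num_classes, num_classes)
        self.model = nn.Sequential(
            nn.Linear(latent_dim + num_classes, 256),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(256, image_size),
            nn.Tanh(),
        )

    def forward(self, z, labels):
        # Condition on the label by concatenating its embedding to the noise
        return self.model(torch.cat([z, self.label_emb(labels)], dim=1))

# Usage: ask for four images of the digit "7"
gen = ConditionalGenerator()
z = torch.randn(4, 100)
labels = torch.full((4,), 7, dtype=torch.long)
imgs = gen(z, labels)
print(imgs.shape)  # torch.Size([4, 784])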

Real-World Applications of GANs and Industry Alternatives

Generative Adversarial Networks (GANs) have evolved from academic experiments to integral tools in various industries. Here’s how they’re applied:

GANs in Industry

  • Zalando: This European fashion retailer employs AI to accelerate content production for marketing campaigns. By utilizing AI to create imagery and digital twins of models, Zalando has significantly reduced image production times and costs.
  • L’Oréal: In collaboration with Nvidia, L’Oréal enhances its AI initiatives, including AI-generated advertising and personalized product recommendations. This partnership aims to boost the speed, accuracy, and security of L’Oréal’s AI applications.
  • Adobe: Integrates GANs into creative software tools like Photoshop, providing designers and artists with new options for content creation and enhancement.

Limitations of GANs

Despite their capabilities, GANs face challenges:

  • Training Instability: GANs can be difficult to train, often requiring careful tuning and large datasets.
  • Mode Collapse: The generator might produce limited varieties of outputs, reducing diversity.
  • Evaluation Metrics: Assessing the quality of generated content can be subjective and lacks standardized metrics.

Industry Alternatives

While GANs are powerful, other models have emerged that are more suitable for certain applications:

  • Diffusion Models: Companies like OpenAI utilize diffusion models in their generative image systems, such as DALL·E 2 and DALL·E 3, to create high-quality images from text descriptions.
  • Transformers: Widely used in natural language processing tasks, transformers have also been adapted for image generation, offering flexibility and scalability.
  • Variational Autoencoders (VAEs): VAEs are utilized for tasks requiring smooth latent space representations and are often combined with other models for enhanced performance.

Conclusion

GANs show how AI can create realistic outputs by having two networks learn from each other. From understanding the generator and discriminator to exploring the math, coding your own GAN, and seeing real-world applications, we get a full picture of how these models work.

They are powerful but not perfect. Training can be tricky and outputs may lack diversity. That is why in industry, diffusion models and transformers are often preferred for large-scale, reliable generation.

Learning GANs helps you build a strong foundation in generative AI and gives insight into the models behind many creative and practical AI applications today.

To visualize GANs: [Video]

To read about Transfer Learning: [Article]

Follow for more explanations and to make artificial intelligence feel real!

“If we don’t sacrifice for what we want, what we want becomes the sacrifice”

Published via Towards AI