Latent Diffusion Models: The Architecture behind Stable Diffusion
Author(s): Louis Bouchard

Originally published on Towards AI.

A High-Resolution Image Synthesis Architecture: Latent Diffusion

What do all recent super-powerful image models like DALLE, Imagen, or Midjourney have in common? Other than their high computing costs, long training times, and shared hype, they are all based on the same mechanism: diffusion.

Diffusion models have recently achieved state-of-the-art results on most image tasks, including text-to-image generation with DALLE, as well as many other image generation tasks such as inpainting, style transfer, and super-resolution.

They come with downsides, though: they work sequentially on the whole image, so both training and inference are expensive.
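To get a feel for why working sequentially on the whole image is costly, here is a rough back-of-the-envelope sketch. The function name `values_processed`, the 512x512 RGB resolution, and the 1,000-step count are illustrative assumptions, not figures taken from any particular model. It only counts how many pixel values the denoising network has to handle when every step operates on the full image.

```python
def values_processed(height, width, channels, num_steps):
    """Count the pixel values handled across all sequential denoising steps.

    Each step works on the full image, and step t+1 cannot start until
    step t has finished, so the per-step work is multiplied by num_steps.
    """
    return height * width * channels * num_steps


# Hypothetical settings, chosen only for illustration.
per_step = values_processed(512, 512, 3, num_steps=1)
total = values_processed(512, 512, 3, num_steps=1000)

print(f"values per denoising step: {per_step:,}")
print(f"values over all steps:     {total:,}")
```

Because the steps depend on one another, this work cannot be spread across the step dimension in parallel, which is why both the image size and the number of steps directly drive the cost.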

