
Diffusers: Python Library for AI-Generated Images

Last Updated on July 17, 2023 by Editorial Team

Author(s): Muhammad Arham

Originally published on Towards AI.

This article shows the basic usage of HuggingFace's Diffusers library, which lets you generate AI images with code.

Image generated using Diffusers Pipeline with Code

Introduction

The Diffusers library, maintained by HuggingFace, is a go-to library for generative AI. It provides multiple Stable Diffusion pipelines for images and audio, along with several other useful functionalities.

For people who prefer a GUI, AUTOMATIC1111's SD-WebUI is available on GitHub. However, a GUI may not be the best option for deployment, and developers commonly use the Diffusers library to build full-blown generative AI applications. This article showcases the setup and usage of the Diffusers library for generating images with Stable Diffusion.

Pre-requisites

Firstly, you need to set up a fresh environment for the Diffusers library. A fresh environment is not strictly necessary, but it helps avoid dependency clashes between pre-installed packages and the versions required by Diffusers and its associated libraries. You can use either a Python virtual environment or a Conda environment.
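
For example, a virtual environment can be created and activated as follows (the environment name diffusers-env is an arbitrary choice):

python -m venv diffusers-env
source diffusers-env/bin/activate  # on Windows: diffusers-env\Scripts\activate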

In the new environment, run the following commands to set up the required packages.

python -m pip install "diffusers[torch]"
python -m pip install transformers

For documentation and the codebases, refer to the GitHub repositories of the Diffusers and Transformers libraries. Both are currently maintained by HuggingFace, with new functionality and updates pushed frequently.

Code

The Diffusers library provides basic text-to-image pipelines. Other available pipelines, including image-to-image, inpainting, and ControlNet, are used similarly. This article focuses on the basic Stable Diffusion text-to-image pipeline.

Import Relevant Libraries

from diffusers import StableDiffusionPipeline
import torch

Setup Hyperparameters and Constants

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
PROMPT = 'hyperrealistic portrait of a man as astronaut, portrait, well lit, cyberpunk,'
MODEL_ID = 'stabilityai/stable-diffusion-2-1'

DEVICE selects the hardware used for inference. If an NVIDIA GPU is detected in the system, it is selected, which speeds up inference considerably; if a GPU is not available, inference falls back to the CPU.

PROMPT is the text description that will be passed to the pipeline.

MODEL_ID identifies the pre-trained model to fetch from HuggingFace. Multiple Stable Diffusion models are available on the HuggingFace Model Hub. Stable Diffusion 2.1 is the most recent release and produces 768×768 outputs; Stable Diffusion 1.5 and 2.0 are also available. Moreover, fine-tuned models trained for specific styles, such as anime or photorealistic images, are available as well.
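
For example, switching to Stable Diffusion 1.5 requires changing only the model identifier; the rest of the code stays the same (note that it produces 512×512 outputs rather than 768×768):

MODEL_ID = 'runwayml/stable-diffusion-v1-5'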

Initialize Pipeline

pipe = StableDiffusionPipeline.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16
).to(DEVICE)

We use torch.float16 precision instead of the default float32. Half precision roughly halves GPU memory usage and makes inference more efficient. Note that half precision is intended for GPU inference; on CPU, the default float32 is the safer choice.
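
As a minimal sketch, the dtype can be chosen based on the detected device, falling back to float32 on CPU (assuming the DEVICE constant defined above):

# Half precision on GPU, full precision on CPU
TORCH_DTYPE = torch.float16 if DEVICE == 'cuda' else torch.float32
pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=TORCH_DTYPE).to(DEVICE)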

The from_pretrained method fetches the pre-trained model from HuggingFace. It downloads and caches all required modules, such as the text encoder, UNet, and variational autoencoder, and returns a StableDiffusionPipeline object.

We then move the pipeline to the hardware that will be used for inference, either the GPU or the CPU.

Inference

result = pipe(PROMPT, num_inference_steps=50, guidance_scale=7).images[0]

We pass the required parameters to the forward call of the StableDiffusionPipeline object. It returns a StableDiffusionPipelineOutput object defined in the Diffusers library, whose images attribute holds a list of generated images. For our use case, we fetch only the first generated image.

The num_inference_steps argument sets the total number of denoising steps. A higher number generally gives better results, since the UNet denoises the image for longer, at the cost of slower inference.

The guidance_scale argument controls how strongly the prompt conditions the output image. A lower guidance scale means the model pays less attention to the prompt, so the output may not reflect it closely; in exchange, the model has more creative freedom, and the generated images show more variation. A higher guidance scale makes the model adhere more closely to the prompt.
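
As a quick experiment (a sketch; the seed 42 and the scale values are arbitrary), the same prompt can be rendered at several guidance scales with a fixed seed, so only the conditioning strength varies between images:

for scale in (3, 7, 12):
    # Re-seed each run so every image starts from identical noise
    generator = torch.Generator(device=DEVICE).manual_seed(42)
    image = pipe(PROMPT, num_inference_steps=50, guidance_scale=scale,
                 generator=generator).images[0]
    image.save(f'guidance_{scale}.png')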

The generated image is a PIL Image object, so the Pillow library can be used to save it or post-process it further.

result.save('result.png')
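
For instance, a quick Pillow post-processing step (a sketch; the 256×256 thumbnail size is arbitrary):

thumbnail = result.resize((256, 256))  # result is already a PIL Image
thumbnail.save('result_thumb.png')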

Output

Image generated using Diffusers Pipeline from Code

The result is a 768×768 image based on the prompt provided. The exact size matches the default resolution the pre-trained model was trained on, but it can be changed via the height and width keyword arguments during inference.
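
For example, to request a 512×512 image instead (dimensions should be multiples of 8 for Stable Diffusion models):

result = pipe(PROMPT, height=512, width=512,
              num_inference_steps=50, guidance_scale=7).images[0]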

Complete Code

from diffusers import StableDiffusionPipeline
import torch

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
PROMPT = 'hyperrealistic portrait of a man as astronaut, portrait, well lit, cyberpunk,'
MODEL_ID = 'stabilityai/stable-diffusion-2-1'

pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16).to(DEVICE)

result = pipe(PROMPT, num_inference_steps=50, guidance_scale=7).images[0]
result.save('result.png')

Conclusion

This article highlighted the basic usage of the Diffusers library, which can be used to deploy generative AI applications. Multiple other pipelines are available for different use cases; the API of each pipeline is similar, with only minor changes to the inference arguments.
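
As an illustration of that similarity, here is a minimal image-to-image sketch (the strength value of 0.75 and the reuse of result.png as the initial image are arbitrary choices):

from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

img2img = StableDiffusionImg2ImgPipeline.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16
).to(DEVICE)
init_image = Image.open('result.png')  # reuse the text-to-image output
result = img2img(PROMPT, image=init_image, strength=0.75).images[0]
result.save('result_img2img.png')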
