Releases · huggingface/diffusers

25 Nov 17:17

anton-l

v0.9.0

6b02323

v0.9.0: Stable Diffusion 2

🎨 Stable Diffusion 2 is here!

Installation

pip install diffusers[torch]==0.9 transformers

Stable Diffusion 2.0 is available in several flavors:

Stable Diffusion 2.0-V at `768x768`

New stable diffusion model (Stable Diffusion 2.0-v) at 768x768 resolution. Same number of parameters in the U-Net as 1.5, but uses OpenCLIP-ViT/H as the text encoder and is trained from scratch. SD 2.0-v is a so-called v-prediction model.

import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, guidance_scale=9, num_inference_steps=25).images[0]
image.save("astronaut.png")

Stable Diffusion 2.0-base at `512x512`

The above model is finetuned from SD 2.0-base, which was trained as a standard noise-prediction model on 512x512 images and is also made available.

import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "stabilityai/stable-diffusion-2-base"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "High quality photo of an astronaut riding a horse in space"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("astronaut.png")

Stable Diffusion 2.0 for Inpanting

This model for text-guided inpanting is finetuned from SD 2.0-base. Follows the mask-generation strategy presented in LAMA which, in combination with the latent VAE representations of the masked image, are used as an additional conditioning.

import PIL
import requests
import torch
from io import BytesIO
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

repo_id = "stabilityai/stable-diffusion-2-inpainting"
pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16, revision="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipe(prompt=prompt, image=init_image, mask_image=mask_image, num_inference_steps=25).images[0]
image.save("yellow_cat.png")

Stable Diffusion X4 Upscaler

The model was trained on crops of size 512x512 and is a text-guided latent upscaling diffusion model. In addition to the textual input, it receives a noise_level as an input parameter, which can be used to add noise to the low-resolution input according to a predefined diffusion schedule.

import requests
from PIL import Image
from io import BytesIO
from diffusers import StableDiffusionUpscalePipeline
import torch

model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipeline = StableDiffusionUpscalePipeline.from_pretrained(model_id, revision="fp16", torch_dtype=torch.float16)
pipeline = pipeline.to("cuda")

url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
response = requests.get(url)
low_res_img = Image.open(BytesIO(response.content)).convert("RGB")
low_res_img = low_res_img.resize((128, 128))

prompt = "a white cat"
upscaled_image = pipeline(prompt=prompt, image=low_res_img).images[0]
upscaled_image.save("upsampled_cat.png")

Saving & Loading is fixed for Versatile Diffusion

Previously there was a 🐛 when saving & loading versatile diffusion - this is fixed now so that memory efficient saving & loading works as expected.

[Versatile Diffusion] Fix remaining tests by @patrickvonplaten in #1418

📝 Changelog

add v prediction by @patil-suraj in #1386
Adapt UNet2D for supre-resolution by @patil-suraj in #1385
Version 0.9.0.dev0 by @anton-l in #1394
Make height and width optional by @patrickvonplaten in #1401
[Config] Add optional arguments by @patrickvonplaten in #1395
Upscaling fixed by @patrickvonplaten in #1402
Add the new SD2 attention params to the VD text unet by @anton-l in #1400
Deprecate sample size by @patrickvonplaten in #1406
Support SD2 attention slicing by @anton-l in #1397
Add SD2 inpainting integration tests by @anton-l in #1412
Fix sample size conversion script by @patrickvonplaten in #1408
fix clip guided by @patrickvonplaten in #1414
Fix all stable diffusion by @patrickvonplaten in #1415
[MPS] call contiguous after permute by @kashif in #1411
Deprecate predict_epsilon by @pcuenca in #1393
Fix ONNX conversion and inference by @anton-l in #1416
Allow to set config params directly in init by @patrickvonplaten in #1419
Add tests for Stable Diffusion 2 V-prediction 768x768 by @anton-l in #1420
StableDiffusionUpscalePipeline by @patil-suraj in #1396
added initial v-pred support to DPM-solver by @kashif in #1421
SD2 docs by @patrickvonplaten in #1424

Contributors

kashif, pcuenca, and 3 other contributors

Assets 2

24 Nov 00:09

anton-l

v0.8.1

3876b20

v0.8.1: Patch release

This patch release fixes an error with CLIPVisionModelWithProjection imports on a non-git transformers installation.

⚠️ Please upgrade with pip install --upgrade diffusers or pip install diffusers==0.8.1

[Bad dependencies] Fix imports (#1382) by @patrickvonplaten

Contributors

patrickvonplaten

Assets 2

23 Nov 18:23

patil-suraj

v0.8.0

16a32c9

v0.8.0: Versatile Diffusion - Text, Images and Variations All in One Diffusion Model

🙆‍♀️ New Models

VersatileDiffusion

VersatileDiffusion, released by SHI-Labs, is a unified multi-flow multimodal diffusion model that is capable of doing multiple tasks such as text2image, image variations, dual-guided(text+image) image generation, image2text.

[Versatile Diffusion] Add versatile diffusion model by @patrickvonplaten @anton-l #1283
Make sure to install transformers from "main":

pip install git+https://github.com/huggingface/transformers

Then you can run:

from diffusers import VersatileDiffusionPipeline
import torch
import requests
from io import BytesIO
from PIL import Image

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# initial image
url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
response = requests.get(url)
image = Image.open(BytesIO(response.content)).convert("RGB")

# prompt
prompt = "a red car"

# text to image
image = pipe.text_to_image(prompt).images[0]

# image variation
image = pipe.image_variation(image).images[0]

# image variation
image = pipe.dual_guided(prompt, image).images[0]

AltDiffusion

AltDiffusion is a multilingual latent diffusion model that supports text-to-image generation for 9 different languages: English, Chinese, Spanish, French, Japanese, Korean, Arabic, Russian and Italian.

Add AltDiffusion by @patrickvonplaten @patil-suraj #1299

Stable Diffusion Image Variations

StableDiffusionImageVariationPipeline by @justinpinkney is a stable diffusion model that takes an image as an input and generates variations of that image. It is conditioned on CLIP image embeddings instead of text.

StableDiffusionImageVariationPipeline by @patil-suraj #1365

Safe Latent Diffusion

Safe Latent Diffusion (SLD), released by ml-research@TUDarmstadt group, is a new practical and sophisticated approach to prevent unsolicited content from being generated by diffusion models. One of the authors of the research contributed their implementation to diffusers.

Add Safe Stable Diffusion Pipeline by @manuelbrack #1244

VQ-Diffusion with classifier-free sampling

vq diffusion classifier free sampling by @williamberman #1294

LDM super resolution

LDM super resolution is a latent 4x super-resolution diffusion model released by CompVis.

Add LDM Super Resolution pipeline by @duongna21 #1116

CycleDiffusion

CycleDiffusion is a method that uses Text-to-Image Diffusion Models for Image-to-Image Editing. It is capable of

Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
Traditional unpaired image-to-image translation with diffusion models trained on two related domains.
Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
Traditional unpaired image-to-image translation with diffusion models trained on two related domains.

Add CycleDiffusion pipeline using Stable Diffusion by @ChenWu98 #888

CLIPSeg + StableDiffusionInpainting.

Uses CLIPSeg to automatically generate a mask using segmentation, and then applies Stable Diffusion in-painting.

K-Diffusion wrapper

K-Diffusion Pipeline is community pipeline that allows to use any sampler from K-diffusion with diffusers models.

[Community Pipelines] K-Diffusion Pipeline by @patrickvonplaten #1360

🌀New SOTA Scheduler

DPMSolverMultistepScheduler is the 🧨 diffusers implementation of DPM-Solver++, a state-of-the-art scheduler that was contributed by one of the authors of the paper. This scheduler is able to achieve great quality in as few as 20 steps. It's a drop-in replacement for the default Stable Diffusion scheduler, so you can use it to essentially half generation times. It works so well that we adopted it for the Stable Diffusion demo Spaces: https://huggingface.co/spaces/stabilityai/stable-diffusion, https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5.

You can use it like this:

from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
scheduler = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)

🌐 Better scheduler API

The example above also demonstrates how to load schedulers using a new API that is coherent with model loading and therefore more natural and intuitive.

You can load a scheduler using from_pretrained, as demonstrated above, or you can instantiate one from an existing scheduler configuration. This is a way to replace the scheduler of a pipeline that was previously loaded:

from diffusers import DiffusionPipeline, EulerDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

Read more about these changes in the documentation. See also the community pipeline that allows using any of the K-diffusion samplers with diffusers, as mentioned above!

🎉 Performance

We work relentlessly to incorporate performance optimizations and memory reduction techniques to 🧨 diffusers. These are two of the most noteworthy incorporations in this release:

Enable memory-efficient attention by default if xFormers is installed.
Use batched-matmuls when possible.

🎁 Quality of Life improvements

Fix/Enable all schedulers for in-painting
Easier loading of local pipelines
cpu offloading: mutli GPU support

📝 Changelog

Add multistep DPM-Solver discrete scheduler by @LuChengTHU in #1132
Remove warning about half precision on MPS by @pcuenca in #1163
Fix typo latens -> latents by @duongna21 in #1171
Fix community pipeline links by @pcuenca in #1162
[Docs] Add loading script by @patrickvonplaten in #1174
Fix dtype safety checker inpaint legacy by @patrickvonplaten in #1137
Community pipeline img2img inpainting by @vvvm23 in #1114
[Community Pipeline] Add multilingual stable diffusion to community pipelines by @juancopi81 in #1142
[Flax examples] Load text encoder from subfolder by @duongna21 in #1147
Link to Dreambooth blog post instead of W&B report by @pcuenca in #1180
Fix small typo by @pcuenca in #1178
[DDIMScheduler] fix noise device in ddim step by @patil-suraj in #1189
MPS schedulers: don't use float64 by @pcuenca in #1169
Warning for invalid options without "--with_prior_preservation" by @shirayu in #1065
[ONNX] Improve ONNXPipeline scheduler compatibility, fix safety_checker by @anton-l in #1173
Restore compatibility with deprecated StableDiffusionOnnxPipeline by @pcuenca in #1191
Update pr docs actions by @mishig25 in #1194
handle dtype xformers attention by @patil-suraj in #1196
[Scheduler] Move predict epsilon to init by @patrickvonplaten in #1155
add licenses to pipelines by @natolambert in #1201
Fix cpu offloading by @anton-l in #1177
Fix slow tests by @patrickvonplaten in #1210
[Flax] fix extra copy pasta 🍝 by @camenduru in #1187
[CLIPGuidedStableDiffusion] support DDIM scheduler by @patil-suraj in #1190
Fix layer names convert LDM script by @duongna21 in #1206
[Loading] Make sure loading edge cases work by @patrickvonplaten in #1192
Add LDM Super Resolution pipeline by @duongna21 in #1116
[Conversion] Improve conversion script by @patrickvonplaten in #1218
DDIM docs by @patrickvonplaten in #1219
apply repeat_interleave fix for mps to stable diffusion image2image pipeline by @jncasey in #1135
Flax tests: don't hardcode number of devices by @pcuenca in #1175
Improve documentation for the LPW pipeline by @exo-pla-net in #1182
Factor out encode text with Copied from by @patrickvonplaten in #1224
Match the generator device to the pipeline for DDPM and DDIM by @anton-l in #1222
[Tests] Fix mps+generator fast tests by @anton-l in #1230
[Tests] Adjust TPU test values by @anton-l in #1233
Add a reference to the name 'Sampler' by @apolinario in #1172
Fix Flax usage comments by @pcuenca in #1211
[Docs] improve img2img example by @ruanrz in #1193
[Stable Diffusion] Fix padding / truncation by @patrickvonplaten in #1226
Finalize stable diffusion refactor by @patrickvonplaten in #1269
Edited attention.py for older xformers by @Lime-Cakes in #1270
Fix wrong link in text2img fine-tuning documentation by @daspartho in #1282
[StableDiffusionInpaintPipeline] fix batch_size for mask and masked latents by @patil-suraj in #1279
Add UNet 1d for RL model for planning + colab by @natolambert in #105
Fix documentation typo for UNet2DModel and UNet2DConditionModel by @xenova in #1275
add source link to composable diffusion model by @nanliu1 in #1293
Fix incorrect link to Stable Diffusion notebook by @dhruvrnaik in #1291
[dreambooth] link to bitsandbytes readme for installation by @0xdevalias in #1229
Add Scheduler.from_pretrained and better scheduler changing by @patrickvonplaten in #1286
Add AltDiffusion by @patrickvonplaten in #1299
Better error messag...

Contributors

ctsims, justinpinkney, and 39 other contributors

Assets 2

05 Nov 21:45

patrickvonplaten

v0.7.2

91592ed

v0.7.2: Patch release

This patch release fixes a bug that broken the Flax Stable Diffusion Inference.
Thanks a mille for spotting it @camenduru in #1145 and thanks a lot to @pcuenca and @kashif for fixing it in #1149

Flax: Flip sin to cos in time embeddings #1149 by @pcuenca

Contributors

kashif, pcuenca, and camenduru

Assets 2

04 Nov 14:22

anton-l

v0.7.1

2ea5eeb

v0.7.1: Patch release

This patch release makes accelerate a soft dependency to avoid an error when installing diffusers with pre-existing torch.

Move accelerate to a soft-dependency #1134 by @patrickvonplaten

Contributors

patrickvonplaten

Assets 2

03 Nov 18:44

patrickvonplaten

v0.7.0

bde4880

v0.7.0: Optimized for Apple Silicon, Improved Performance, Awesome Community

❤️ PyTorch + Accelerate

⚠️ The PyTorch pipelines now require accelerate for improved model loading times!
Install Diffusers with pip install --upgrade diffusers[torch] to get everything in a single command.

🍎 Apple Silicon support with PyTorch 1.13

PyTorch and Apple have been working on improving mps support in PyTorch 1.13, so Apple Silicon is now a first-class citizen in diffusers 0.7.0!

Requirements

Mac computer with Apple silicon (M1/M2) hardware.
macOS 12.6 or later (13.0 or later recommended, as support is even better).
arm64 version of Python.
PyTorch 1.13.0 official release, installed from pip or the conda channels.

Memory efficient generation

Memory management is crucial to achieve fast generation speed. We recommend to always use attention slicing on Apple Silicon, as it drastically reduces memory pressure and prevents paging or swapping. This is especially important for computers with less than 64 GB of Unified RAM, and may be the difference between generating an image in seconds rather than in minutes. Use it like this:

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("mps")

# Recommended if your computer has < 64 GB of RAM
pipe.enable_attention_slicing()

prompt = "a photo of an astronaut riding a horse on mars"

# First-time "warmup" pass
_ = pipe(prompt, num_inference_steps=1)

image = pipe(prompt).images[0]
image.save("astronaut.png")

Continuous Integration

Our automated tests now include a full battery of tests on the mps device. This will be helpful to identify issues early and ensure the quality on Apple Silicon going forward.

See more details in the documentation.

💃 Dance Diffusion

diffusers goes audio 🎵 Dance Diffusion by Harmonai is the first audio model in 🧨Diffusers!

[Dance Diffusion] Add dance diffusion by @patrickvonplaten #803

Try it out to generate some random music:

from diffusers import DiffusionPipeline
import scipy

model_id = "harmonai/maestro-150k"
pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline = pipeline.to("cuda")

audio = pipeline(audio_length_in_s=4.0).audios[0]

# To save locally
scipy.io.wavfile.write("maestro_test.wav", pipe.unet.sample_rate, audio.transpose())

🎉 Euler schedulers

These are the Euler schedulers, from the paper Elucidating the Design Space of Diffusion-Based Generative Models by Karras et al. (2022). The diffusers implementation is based on the original k-diffusion implementation by Katherine Crowson. The Euler schedulers are fast, often times generating really good outputs with 20-30 steps.

k-diffusion-euler by @hlky #1019

from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

euler_scheduler = EulerDiscreteScheduler.from_config("runwayml/stable-diffusion-v1-5", subfolder="scheduler")
pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", scheduler=euler_scheduler, revision="fp16", torch_dtype=torch.float16
)
pipeline.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipeline(prompt, num_inference_steps=25).images[0]

from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

euler_ancestral_scheduler = EulerAncestralDiscreteScheduler.from_config("runwayml/stable-diffusion-v1-5", subfolder="scheduler")
pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", scheduler=euler_scheduler, revision="fp16", torch_dtype=torch.float16
)
pipeline.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipeline(prompt, num_inference_steps=25).images[0]

🔥 Up to 2x faster inference with `memory_efficient_attention`

Even faster and memory efficient stable diffusion using the efficient flash attention implementation from xformers

Up to 2x speedup on GPUs using memory efficient attention by @MatthieuTPHR #532

To leverage it just make sure you have:

PyTorch > 1.12
Cuda available
Installed the xformers library

from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="fp16",
    torch_dtype=torch.float16,
).to("cuda")

pipe.enable_xformers_memory_efficient_attention()

with torch.inference_mode():
    sample = pipe("a small cat")

# optional: You can disable it via
# pipe.disable_xformers_memory_efficient_attention()

🚀 Much faster loading

Thanks to accelerate, pipeline loading is much, much faster. There are two parts to it:

First, when a model is created PyTorch initializes its weights by default. This takes a good amount of time. Using low_cpu_mem_usage (enabled by default), no initialization will be performed.
Optionally, you can also use device_map="auto" to automatically select the best device(s) where the pre-trained weights will be initially sent to.

In our tests, loading time was more than halved on CUDA devices, and went down from 12s to 4s on an Apple M1 computer.

As a side effect, CPU usage will be greatly reduced during loading, because no temporary copies of the weights are necessary.

This feature requires PyTorch 1.9 or better and accelerate 0.8.0 or higher.

🎨 RePaint

RePaint allows to reuse any pretrained DDPM model for free-form inpainting by adding restarts to the denoising schedule. Based on the paper RePaint: Inpainting using Denoising Diffusion Probabilistic Models by Andreas Lugmayr et al.

from diffusers import RePaintPipeline, RePaintScheduler

# Load the RePaint scheduler and pipeline based on a pretrained DDPM model
scheduler = RePaintScheduler.from_config("google/ddpm-ema-celebahq-256")
pipe = RePaintPipeline.from_pretrained("google/ddpm-ema-celebahq-256", scheduler=scheduler)
pipe = pipe.to("cuda")

generator = torch.Generator(device="cuda").manual_seed(0)
output = pipe(
    original_image=original_image,
    mask_image=mask_image,
    num_inference_steps=250,
    eta=0.0,
    jump_length=10,
    jump_n_sample=10,
    generator=generator,
)
inpainted_image = output.images[0]

🌍 Community Pipelines

Long Prompt Weighting Stable Diffusion

The Pipeline lets you input prompt without 77 token length limit. And you can increase words weighting by using "()" or decrease words weighting by using "[]". The Pipeline also lets you use the main use cases of the stable diffusion pipeline in a single class.
For a code example, see Long Prompt Weighting Stable Diffusion

[Community Pipelines] Long Prompt Weighting Stable Diffusion Pipelines by @SkyTNT in #907

Speech to Image

Generate an image from an audio sample using pre-trained OpenAI whisper-small and Stable Diffusion.
For a code example, see Speech to Image

[Examples] add speech to image pipeline example by @MikailINTech in #897

Wildcard Stable Diffusion

A minimal implementation that allows for users to add "wildcards", denoted by __wildcard__ to prompts that are used as placeholders for randomly sampled values given by either a dictionary or a .txt file.
For a code example, see Wildcard Stable Diffusion

Wildcard stable diffusion pipeline by @shyamsn97 in #900

Composable Stable Diffusion

Use logic operators to do compositional generation.
For a code example, see Composable Stable Diffusion

Add Composable diffusion to community pipeline examples by @MarkRich in #951

Imagic Stable Diffusion

Image editing with Stable Diffusion.
For a code example, see Imagic Stable Diffusion

Add imagic to community pipelines by @MarkRich in #958

Seed Resizing

Allows to generate a larger image while keeping the content of the original image.
For a code example, see Seed Resizing

Add seed resizing to community pipelines by @MarkRich in #1011

📝 Changelog

[Community Pipelines] Long Prompt Weighting Stable Diffusion Pipelines by @SkyTNT in #907
[Stable Diffusion] Add components function by @patrickvonplaten in #889
[PNDM Scheduler] Make sure list cannot grow forever by @patrickvonplaten in #882
[DiffusionPipeline.from_pretrained] add warning when passing unused k… by @patrickvonplaten in #870
DOC Dreambooth Add --sample_batch_size=1 to the 8 GB dreambooth example script by @leszekhanusz in #829
[Examples] add speech to image pipeline example by @MikailINTech in #897
[dreambooth] dont use safety check when generating prior images by @patil-suraj in #922
Dreambooth class image generation: ...

Contributors

kashif, apolinario, and 36 other contributors

Assets 2

19 Oct 15:52

anton-l

v0.6.0

ad9d7ce

v0.6.0: Finetuned Stable Diffusion inpainting

🎨 Finetuned Stable Diffusion inpainting

The first official stable diffusion checkpoint fine-tuned on inpainting has been released.

You can try it out in the official demo here

or code it up yourself 💻 :

from io import BytesIO

import torch

import PIL
import requests
from diffusers import StableDiffusionInpaintPipeline


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    revision="fp16",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"

output = pipe(prompt=prompt, image=image, mask_image=mask_image)
image = output.images[0]

gives:

`image`	`mask_image`	`prompt`		Output
		Face of a yellow cat, high resolution, sitting on a park bench	=>

⚠️ This release deprecates the unsupervised noising-based inpainting pipeline into StableDiffusionInpaintPipelineLegacy.
The new StableDiffusionInpaintPipeline is based on a Stable Diffusion model finetuned for the inpainting task: https://huggingface.co/runwayml/stable-diffusion-inpainting

Note
When loading StableDiffusionInpaintPipeline with a non-finetuned model (i.e. the one saved with diffusers<=0.5.1), the pipeline will default to StableDiffusionInpaintPipelineLegacy, to maintain backward compatibility ✨

from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

assert pipe.__class__ .__name__ == "StableDiffusionInpaintPipelineLegacy"

Context:

Why this change? When Stable Diffusion came out ~2 months ago, there were many unofficial in-painting demos using the original v1-4 checkpoint ("CompVis/stable-diffusion-v1-4"). These demos worked reasonably well, so that we integrated an experimental StableDiffusionInpaintPipeline class into diffusers. Now that the official inpainting checkpoint was released: https://github.com/runwayml/stable-diffusion we decided to make this our official pipeline and move the old / hacky one to "StableDiffusionInpaintPipelineLegacy".

🚀 ONNX pipelines for image2image and inpainting

Thanks to the contribution by @zledas (#552) this release supports OnnxStableDiffusionImg2ImgPipeline and OnnxStableDiffusionInpaintPipeline optimized for CPU inference:

from diffusers import OnnxStableDiffusionImg2ImgPipeline, OnnxStableDiffusionInpaintPipeline

img_pipeline = OnnxStableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="onnx", provider="CPUExecutionProvider"
)

inpaint_pipeline = OnnxStableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", revision="onnx", provider="CPUExecutionProvider"
)

🌍 Community Pipelines

Two new community pipelines have been added to diffusers 🔥

Stable Diffusion Interpolation example

Interpolate the latent space of Stable Diffusion between different prompts/seeds.
For more info see stable-diffusion-videos.

For a code example, see Stable Diffusion Interpolation

Add Stable Diffusion Interpolation Example by @nateraw in #862

Stable Diffusion Interpolation Mega

One Stable Diffusion Pipeline with all functionalities of Text2Image, Image2Image and Inpainting

For a code example, see Stable Diffusion Mega

All in one Stable Diffusion Pipeline by @patrickvonplaten in #821

📝 Changelog

[Community] One step unet by @patrickvonplaten in #840
Remove unneeded use_auth_token by @osanseviero in #839
Bump to 0.6.0.dev0 by @anton-l in #831
Remove the last of ["sample"] by @anton-l in #842
Fix Flax pipeline: width and height are ignored #838 by @camenduru in #848
[DeviceMap] Make sure stable diffusion can be loaded from older trans… by @patrickvonplaten in #860
Fix small community pipeline import bug and finish README by @patrickvonplaten in #869
Fix training push_to_hub (unconditional image generation): models were not saved before pushing to hub by @pcuenca in #868
Fix table in community README.md by @nateraw in #879
Add generic inference example to community pipeline readme by @apolinario in #874
Rename frame filename in interpolation community example by @nateraw in #881
Add Apple M1 tests by @anton-l in #796
Fix autoencoder test by @pcuenca in #886
Rename StableDiffusionOnnxPipeline -> OnnxStableDiffusionPipeline by @anton-l in #887
Fix DDIM on Windows not using int64 for timesteps by @hafriedlander in #819
[dreambooth] allow fine-tuning text encoder by @patil-suraj in #883
Stable Diffusion image-to-image and inpaint using onnx. by @zledas in #552
Improve ONNX img2img numpy handling, temporarily fix the tests by @anton-l in #899
[Stable Diffusion Inpainting] Deprecate inpainting pipeline in favor of official one by @patrickvonplaten in #903
[Communit Pipeline] Make sure "mega" uses correct inpaint pipeline by @patrickvonplaten in #908
Stable diffusion inpainting by @patil-suraj in #904
ONNX supervised inpainting by @anton-l in #906

Contributors

hafriedlander, apolinario, and 8 other contributors

Assets 2

13 Oct 19:24

patrickvonplaten

v0.5.1

3538e03

v0.5.1: Patch release

This patch release fixes an bug with Flax's NFSW safety checker in the pipeline.

#832 by @patil-suraj

Contributors

patil-suraj

Assets 2

13 Oct 17:54

anton-l

v0.5.0

0679d09

v0.5.0: JAX/Flax and TPU support

🌾 JAX/Flax integration for super fast Stable Diffusion on TPUs.

We added JAX support for Stable Diffusion! You can now run Stable Diffusion on Colab TPUs (and GPUs too!) for faster inference.

Check out this TPU-ready colab for a Stable Diffusion pipeline:
And a detailed blog post on Stable Diffusion and parallelism in JAX / Flax 🤗 https://huggingface.co/blog/stable_diffusion_jax

The most used models, schedulers and pipelines have been ported to JAX/Flax, namely:

Models: FlaxAutoencoderKL, FlaxUNet2DConditionModel
Schedulers: FlaxDDIMScheduler, FlaxDDIMScheduler, FlaxPNDMScheduler
Pipelines: FlaxStableDiffusionPipeline

Changelog:

Implement FlaxModelMixin #493 by @mishig25 , @patil-suraj, @patrickvonplaten , @pcuenca
Karras VE, DDIM and DDPM flax schedulers #508 by @kashif
initial flax pndm scheduler #492 by @kashif
FlaxDiffusionPipeline & FlaxStableDiffusionPipeline #559 by @mishig25 , @patrickvonplaten , @pcuenca
Flax pipeline pndm #583 by @pcuenca
Add from_pt argument in .from_pretrained #527 by @younesbelkada
Make flax from_pretrained work with local subfolder #608 by @mishig25

🔥 DeepSpeed low-memory training

Thanks to the 🤗 accelerate integration with DeepSpeed, a few of our training examples became even more optimized in terms of VRAM and speed:

DreamBooth is now trainable on 8GB GPUs thanks to a contribution from @Ttl! Find out how to run it here.
The Text2Image finetuning example is also fully compatible with DeepSpeed.

✏️ Changelog

Revert "[v0.4.0] Temporarily remove Flax modules from the public API by @anton-l in #755)"
Fix push_to_hub for dreambooth and textual_inversion by @YaYaB in #748
Fix ONNX conversion script opset argument type by @justinchuby in #739
Add final latent slice checks to SD pipeline intermediate state tests by @jamestiotio in #731
fix(DDIM scheduler): use correct dtype for noise by @keturn in #742
[Tests] Fix tests by @patrickvonplaten in #774
debug an exception by @LowinLi in #638
Clean up resnet.py file by @natolambert in #780
add sigmoid betas by @natolambert in #777
[Low CPU memory] + device map by @patrickvonplaten in #772
Fix gradient checkpointing test by @patrickvonplaten in #797
fix typo docstring in unet2d by @natolambert in #798
DreamBooth DeepSpeed support for under 8 GB VRAM training by @Ttl in #735
support bf16 for stable diffusion by @patil-suraj in #792
stable diffusion fine-tuning by @patil-suraj in #356
Flax: Trickle down norm_num_groups by @akash5474 in #789
Eventually preserve this typo? :) by @spezialspezial in #804
Fix indentation in the code example by @osanseviero in #802
[Img2Img] Fix batch size mismatch prompts vs. init images by @patrickvonplaten in #793
Minor package fixes by @anton-l in #809
[Dummy imports] Better error message by @patrickvonplaten in #795
add or fix license formatting in models directory by @natolambert in #808
[train_text2image] Fix EMA and make it compatible with deepspeed. by @patil-suraj in #813
Fix fine-tuning compatibility with deepspeed by @pink-red in #816
Add diffusers version and pipeline class to the Hub UA by @anton-l in #814
[Flax] Add test by @patrickvonplaten in #824
update flax scheduler API by @patil-suraj in #822
Fix dreambooth loss type with prior_preservation and fp16 by @anton-l in #826
Fix type mismatch error, add tests for negative prompts by @anton-l in #823
Give more customizable options for safety checker by @patrickvonplaten in #815
Flax safety checker by @pcuenca in #825
Align PT and Flax API - allow loading checkpoint from PyTorch configs by @patrickvonplaten in #827

Contributors

kashif, keturn, and 16 other contributors

Assets 2

11 Oct 22:48

patrickvonplaten

v0.4.2

b2c9b54

v0.4.2: Patch release

This patch release allows the img2img pipeline to be run on fp16 and fixes a bug with the "mps" device.

Contributors

pcuenca and patil-suraj

Assets 2

Releases: huggingface/diffusers

v0.9.0: Stable Diffusion 2

🎨 Stable Diffusion 2 is here!

Installation

Stable Diffusion 2.0-V at 768x768

Stable Diffusion 2.0-base at 512x512

Stable Diffusion 2.0 for Inpanting

Stable Diffusion X4 Upscaler

Saving & Loading is fixed for Versatile Diffusion

📝 Changelog

Contributors

v0.8.1: Patch release

Contributors

v0.8.0: Versatile Diffusion - Text, Images and Variations All in One Diffusion Model

🙆‍♀️ New Models

VersatileDiffusion

AltDiffusion

Stable Diffusion Image Variations

Safe Latent Diffusion

VQ-Diffusion with classifier-free sampling

LDM super resolution

CycleDiffusion

CLIPSeg + StableDiffusionInpainting.

K-Diffusion wrapper

🌀New SOTA Scheduler

🌐 Better scheduler API

🎉 Performance

🎁 Quality of Life improvements

📝 Changelog

Contributors

v0.7.2: Patch release

Contributors

v0.7.1: Patch release

Contributors

v0.7.0: Optimized for Apple Silicon, Improved Performance, Awesome Community

❤️ PyTorch + Accelerate

🍎 Apple Silicon support with PyTorch 1.13

Requirements

Memory efficient generation

Continuous Integration

💃 Dance Diffusion

🎉 Euler schedulers

🔥 Up to 2x faster inference with memory_efficient_attention

🚀 Much faster loading

🎨 RePaint

🌍 Community Pipelines

Long Prompt Weighting Stable Diffusion

Speech to Image

Wildcard Stable Diffusion

Composable Stable Diffusion

Imagic Stable Diffusion

Seed Resizing

📝 Changelog

Contributors

v0.6.0: Finetuned Stable Diffusion inpainting

🎨 Finetuned Stable Diffusion inpainting

🚀 ONNX pipelines for image2image and inpainting

🌍 Community Pipelines

Stable Diffusion Interpolation example

Stable Diffusion Interpolation Mega

📝 Changelog

Contributors

v0.5.1: Patch release

Contributors

v0.5.0: JAX/Flax and TPU support

🌾 JAX/Flax integration for super fast Stable Diffusion on TPUs.

🔥 DeepSpeed low-memory training

✏️ Changelog

Contributors

v0.4.2: Patch release

Contributors

Stable Diffusion 2.0-V at `768x768`

Stable Diffusion 2.0-base at `512x512`

🔥 Up to 2x faster inference with `memory_efficient_attention`