🔥 🔥 voltaML-fast-stable-diffusion webUI 🔥 🔥

A lightweight library that converts Stable Diffusion and Dreambooth models into accelerated inference models, with a single click in the webUI or a single line of code.

Setup webUI

Docker setup (if required)

Set up Docker on Ubuntu using these instructions.

Set up Docker on Windows using these instructions.

Launch voltaML container

Download the docker-compose.yml file from this repo.

⚠️ Linux: Open it in a text editor and change the path of the output folder, because the file ships configured with a Windows path. The line to edit is marked below.

output:
  driver: local
  driver_opts:
    type: none
    device: C:\voltaml\output # this line
    o: bind
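If you prefer to script the path change rather than edit the file by hand, a minimal sketch could look like the following. The function name `set_output_device` and the example path `/home/user/voltaml/output` are hypothetical, and the sketch assumes the file contains a single `device:` entry like the one shown above. With a bind-type local volume, the target folder generally needs to exist before the container starts.

```python
import re


def set_output_device(compose_text: str, new_path: str) -> str:
    """Replace the value of the first `device:` entry in a compose file.

    Any trailing comment on that line (such as "# this line") is dropped.
    """
    pattern = re.compile(r"^([ \t]*device:[ \t]*).*$", re.MULTILINE)
    # A function is used as the replacement so backslashes in paths are kept literally.
    new_text, count = pattern.subn(
        lambda match: match.group(1) + new_path, compose_text, count=1
    )
    if count == 0:
        raise ValueError("no 'device:' entry found in the compose text")
    return new_text


# Example with the snippet from this README (the Linux path is a placeholder).
sample = (
    "output:\n"
    "  driver: local\n"
    "  driver_opts:\n"
    "    type: none\n"
    "    device: C:\\voltaml\\output # this line\n"
    "    o: bind\n"
)
print(set_output_device(sample, "/home/user/voltaml/output"))
```

To apply it to the real file, read docker-compose.yml into a string, pass it through the function, and write the result back.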

Then open a terminal in that folder and run the following command:

Linux

sudo docker-compose up

Windows

docker-compose up

How to use webUI

Once the container is launched, a Flask app starts. Copy the URL it shows and paste it into your browser to open the webUI on your local host.
There are two backends for running Stable Diffusion: PyTorch and TensorRT (the fastest version, by NVIDIA).
To run PyTorch inference, select a model. It will be downloaded into the container (this takes a few minutes), and then the inference output will be displayed. Downloaded models will then show up in the webUI.
To run TensorRT inference, go to the Accelerate tab, pick a model from our model hub, and click the accelerate button.
Once acceleration is done, the model will show up in your TensorRT drop-down menu.
Switch your backend to TensorRT, select the model, and enjoy the fastest outputs 🚀🚀
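Because the container can take a while to start, it may help to wait until the webUI answers before opening it. The sketch below polls a URL using only the Python standard library. The helper name `wait_for_webui` is hypothetical, and the URL in the usage comment is a placeholder: use the URL that the container actually shows.

```python
import time
import urllib.error
import urllib.request


def wait_for_webui(url: str, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll `url` until it returns a successful HTTP response or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True  # the server answered without an HTTP error
        except (urllib.error.URLError, OSError):
            pass  # not reachable yet (or returned an error status); retry
        time.sleep(interval_s)
    return False


# Usage (placeholder URL, replace with the one shown by the container):
# if wait_for_webui("http://localhost:5000"):
#     print("webUI is up")
```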

Benchmark

The benchmarks below were measured while generating a 512x512 image with batch size 1 for 50 iterations.

Model	T4 (it/s)	A10 (it/s)	A100 (it/s)	4090 (it/s)	3090 (it/s)	2080Ti (it/s)
PyTorch	4.3	8.8	15.1	19	11	8
Flash attention xformers	5.5	15.6	27.5	28	15.7	N/A
AITemplate	Not supported	26.7	55	60	N/A	Not supported
VoltaML(TRT-Flash)	11.4	29.2	62.8	85	44.7	26.2
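To make the table easier to read, the sketch below derives two figures from it: the speedup of VoltaML (TRT-Flash) over plain PyTorch on each GPU, and a rough time for the 50 iterations. It assumes that "50 iterations" means 50 denoising steps for one image and ignores any overhead outside those steps, so the times are estimates rather than measured results. Only the PyTorch and VoltaML rows are copied from the table.

```python
# Iterations per second, copied from the PyTorch and VoltaML rows of the table above.
PYTORCH = {"T4": 4.3, "A10": 8.8, "A100": 15.1, "4090": 19.0, "3090": 11.0, "2080Ti": 8.0}
VOLTAML = {"T4": 11.4, "A10": 29.2, "A100": 62.8, "4090": 85.0, "3090": 44.7, "2080Ti": 26.2}

STEPS = 50  # assumption: one image takes 50 iterations, as in the benchmark setup


def speedup(gpu: str) -> float:
    """VoltaML throughput divided by PyTorch throughput for one GPU."""
    return VOLTAML[gpu] / PYTORCH[gpu]


def seconds_for_steps(it_per_s: float, steps: int = STEPS) -> float:
    """Estimated time spent on the iterations alone, excluding other overhead."""
    return steps / it_per_s


for gpu in PYTORCH:
    print(
        f"{gpu:>7}: {speedup(gpu):.2f}x faster, "
        f"{seconds_for_steps(PYTORCH[gpu]):.1f}s -> {seconds_for_steps(VOLTAML[gpu]):.1f}s"
    )
```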

⚠️ ‼️ Warnings/Caveats

This is v0.1 of the product. Things might break. A lot of improvements are on the way, so please bear with us.

This will only work for NVIDIA GPUs with compute capability 7.5 or higher (the T4 and 2080Ti in the benchmark above are both 7.5).
Cards with less than 12GB VRAM will have issues with acceleration, because the conversions require a lot of memory. We're working on resolving these in our next release.
While the model is accelerating, no other functionality will work, since the GPU will be fully occupied.
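The first two caveats can be checked before starting an acceleration. The sketch below is not part of voltaML: `check_gpu` and `query_first_gpu` are hypothetical helper names. It treats compute capability 7.5 itself as supported, which is an assumption based on the T4 and 2080Ti (both 7.5) appearing in the benchmark table. The optional query uses PyTorch if it is installed.

```python
from typing import List, Optional, Tuple

MIN_CAPABILITY = (7, 5)  # assumption: 7.5 itself is supported (T4, 2080Ti are benchmarked)
MIN_VRAM_GB = 12.0       # below this, the README warns that acceleration may fail


def check_gpu(capability: Tuple[int, int], vram_gb: float) -> List[str]:
    """Return a list of warnings for a GPU; an empty list means no known issue."""
    problems = []
    if capability < MIN_CAPABILITY:
        problems.append(
            f"compute capability {capability[0]}.{capability[1]} is below 7.5"
        )
    if vram_gb < MIN_VRAM_GB:
        # Drivers can report slightly less than the nominal size, so treat a
        # borderline result as a hint rather than a hard failure.
        problems.append(f"{vram_gb:.1f} GB VRAM is below 12 GB; acceleration may fail")
    return problems


def query_first_gpu() -> Optional[Tuple[Tuple[int, int], float]]:
    """Read capability and VRAM of GPU 0 via PyTorch, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(0)
    return (props.major, props.minor), props.total_memory / 1024**3


if __name__ == "__main__":
    info = query_first_gpu()
    if info is None:
        print("No CUDA GPU visible to PyTorch (or PyTorch is not installed).")
    else:
        for line in check_gpu(*info) or ["GPU meets the stated requirements."]:
            print(line)
```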

axel-havard/voltaML-fast-stable-diffusion
