A basic diffusion model from scratch: a Denoising Diffusion Probabilistic Models (DDPM) pipeline. Built on top of https://github.com/cloneofsimo/minDiffusion/tree/master. It supports DDPM both with and without additional conditions (e.g. text information).
- Create a conda environment.
conda create --name diffusion python=3.9 -y
conda activate diffusion
- Install PyTorch.
pip install torch torchvision torchaudio
- Install additional libraries.
pip install tqdm openai-clip
- CIFAR10
No preparation needed.
- Customized dataset
Download your own images into ./data and write a custom data loader for them.
Run the command below to train. You can select the DDPM model (DDPM with NaiveUnet or DDPM with ContextUnet) and set the hyperparameters inside the script.
python train.py
Run the command below to generate samples. You can select the model (NaiveUnet or ContextUnet) and set the hyperparameters inside the script.
python inference.py
- DDPM, 1000 steps, CIFAR10, without conditions, trained for 100 epochs.
- DDPM, 1000 steps, CIFAR10, with one-hot encoded class conditions, trained for 100 epochs. From the first row to the last, the rows are conditioned on 'automobile', 'cat', 'dog', and 'ship', respectively.
- DDPM, 1000 steps, CIFAR10, with text-embedding class conditions, trained for 100 epochs. From the first row to the last, the rows are conditioned on 'automobile', 'cat', 'dog', and 'ship', respectively.
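The "1000 steps" in the results above is the number of diffusion timesteps. To illustrate what that means, here is a dependency-free sketch of a linear noise schedule and the forward noising step x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. The function names are hypothetical, and the beta range of 1e-4 to 0.02 is an assumption based on common DDPM defaults; check train.py for the values this repository actually uses.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed defaults, see the note above)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + step * t for t in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_sample(x0, t, alpha_bars, rng=random):
    """Forward process: mix a clean sample x0 with Gaussian noise at step t."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, eps)]
    return x_t, eps  # eps is the target the network learns to predict


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
x_t, eps = noise_sample([0.5, -0.25, 1.0], t=999, alpha_bars=alpha_bars)
```

At the last timestep alpha_bar is close to zero, so x_t is almost pure noise; sampling reverses this chain one step at a time, which is why inference with 1000 steps needs 1000 network evaluations per image.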
- DDPM without conditions (NaiveUnet), trained for 100 epochs on CIFAR10.
https://drive.google.com/file/d/1e95Rkgb1DtvFPuyynMipOPprMcgrp99j/view?usp=sharing
- DDPM with one-hot encoded class conditions (ContextUnet), trained for 100 epochs on CIFAR10.
https://drive.google.com/file/d/1tK6a1mOisSM-holI8Ou3AFlX5MTzCr7i/view?usp=sharing
- DDPM with text-embedding class conditions (ContextUnet), trained for 100 epochs on CIFAR10.
https://drive.google.com/file/d/1OZwl_cjNqretPH-Azj3sMoYC4jFNt_Jc/view?usp=sharing