huynhspm/Generative-Model

Diffusion-Models

1. Introduction

A diffusion model is a type of generative model whose approach differs from GANs, VAEs, and flow-based models. In this repository, I re-implement diffusion models from scratch to run the following experiments:

  • Diffusion Model: Training with simple loss
  • Inference with DDPM and DDIM
  • Using a label, image, or text as the condition for the diffusion model
  • Latent diffusion: Image space to latent space with VAE
  • Stable diffusion: Latent + Condition Diffusion
  • Classifier-free guidance
  • Sketch2Image: using a sketch image as the condition
  • Medical Image Segmentation: using a medical image as the condition
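
As a rough illustration of "training with simple loss", here is a minimal pure-Python sketch of the forward noising step and the noise-prediction MSE. The linear beta schedule, the function names, and the three-pixel toy "image" are assumptions for illustration, not the code in this repository.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def q_sample(x0, t, alpha_bars, noise):
    """Noised sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * noise."""
    a = math.sqrt(alpha_bars[t])
    b = math.sqrt(1.0 - alpha_bars[t])
    return [a * x + b * n for x, n in zip(x0, noise)]

def simple_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise."""
    return sum((p - n) ** 2 for p, n in zip(predicted_noise, true_noise)) / len(true_noise)

rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [0.5, -0.25, 1.0]                      # toy "image" with three pixels
noise = [rng.gauss(0.0, 1.0) for _ in x0]   # the noise the network should predict
t = rng.randrange(len(betas))               # random timestep, as in training
x_t = q_sample(x0, t, alpha_bars, noise)

# A perfect noise predictor gives zero loss; an all-zero predictor does not.
print(simple_loss(noise, noise), simple_loss([0.0] * len(noise), noise))
```

In real training, `predicted_noise` would come from the U-Net evaluated on `x_t` and `t`; the sketch only shows what the loss compares.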

2. Set Up

Clone the repository

git clone https://github.com/huynhspm/Generative-Model

Install environment packages

cd Generative-Model
conda create -n diffusion python=3.10
conda activate diffusion 
pip install -r requirements.txt

Training

Set CUDA_VISIBLE_DEVICES and WANDB_API_KEY before training:

export CUDA_VISIBLE_DEVICES=0
export WANDB_API_KEY=???

Choose one of the available experiments in the "configs/experiment" folder, or create your own experiment config to suit your task:

# for generation task
python src/train.py experiment=generation/diffusion/train/mnist trainer.devices=1

# for reconstruction task
python src/train.py experiment=reconstruction/vq_vae/celeba trainer.devices=1

# for segmentation task
python src/train.py experiment=segmentation/condition_diffusion/train/lidc trainer.devices=1
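
If you launch several runs from a script, a small helper can check the two environment variables and assemble the command line. This is only a sketch: `missing_env` and `build_command` are hypothetical helpers that are not part of this repository, and the `key=value` arguments are assumed to be Hydra-style overrides based on the commands above.

```python
import os

REQUIRED_ENV = ("CUDA_VISIBLE_DEVICES", "WANDB_API_KEY")

def missing_env(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]

def build_command(script, experiment, devices=1):
    """Build the argument list for one run, mirroring the commands above."""
    return ["python", script, f"experiment={experiment}", f"trainer.devices={devices}"]

cmd = build_command("src/train.py", "generation/diffusion/train/mnist")
print("missing variables:", missing_env())
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once `missing_env()` returns an empty list.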

Evaluation

Set CUDA_VISIBLE_DEVICES and WANDB_API_KEY before evaluating:

export CUDA_VISIBLE_DEVICES=0
export WANDB_API_KEY=???

Choose one of the available experiments in the "configs/experiment" folder, or create your own experiment config to suit your task:

# for generation task
python src/eval.py experiment=generation/diffusion/eval/mnist trainer.devices=1

# for reconstruction task
...

# for segmentation task
python src/eval.py experiment=segmentation/condition_diffusion/eval/lidc trainer.devices=1

Inference

...

3. Diffusion Model

3.1. Dataset

3.2. Attention

  • Self Attention
  • Cross Attention
  • Spatial Transformer
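
To make the difference between self attention and cross attention concrete, here is a minimal pure-Python sketch of scaled dot-product attention. It leaves out the learned query/key/value projections and multiple heads, and the token values are made up; it is not the implementation used in this repository.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention on lists of vectors (no learned projections)."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # flattened feature-map positions
context_tokens = [[0.5, 0.5], [2.0, -1.0]]            # e.g. embedded condition tokens

# Self attention: queries, keys, and values all come from the image tokens.
self_out = attention(image_tokens, image_tokens, image_tokens)
# Cross attention: queries come from the image, keys and values from the condition.
cross_out = attention(image_tokens, context_tokens, context_tokens)
print(self_out)
print(cross_out)
```

A spatial transformer block, as commonly used in latent diffusion models, flattens a feature map into tokens like `image_tokens`, applies self attention and then cross attention, and reshapes the result back to a feature map.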

3.3. Backbone

  • ResNet Block
  • VGG Block
  • DenseNet Block
  • Inception Block

3.4. Embedder

  • Time
  • Label: animal (dog, cat), digit (0, 1, ..., 9), gender (male, female)
  • Image: Sketch2Image, Segmentation
  • Text: not implemented
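
For the time embedder, a common choice is the sinusoidal embedding borrowed from Transformers. The sketch below assumes that choice, an even `dim`, and a `max_period` of 10000; the embedder in this repository may differ.

```python
import math

def timestep_embedding(t, dim, max_period=10000.0):
    """Sinusoidal embedding of timestep t: [sin(t*f_0..), cos(t*f_0..)].

    Assumes dim is even; frequencies decay geometrically from 1 to 1/max_period.
    """
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]

print(timestep_embedding(0, 4))    # sin terms are 0 and cos terms are 1 at t = 0
print(timestep_embedding(10, 8))
```

The resulting vector is usually passed through a small MLP and added to the feature maps inside each block. A label embedder can be as simple as a learned lookup table indexed by class id.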

3.5. Sampler

  • DDPM: Denoising Diffusion Probabilistic Models
  • DDIM: Denoising Diffusion Implicit Models
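
The two samplers differ in how they step from x_t to x_{t-1} given the predicted noise. The sketch below works on a single scalar "pixel" and assumes the DDPM variance choice sigma_t^2 = beta_t and the deterministic DDIM setting (eta = 0); the function names are illustrative and not taken from this repository.

```python
import math
import random

def predict_x0(x_t, eps, abar_t):
    """Clean-sample estimate implied by x_t and the predicted noise eps."""
    return (x_t - math.sqrt(1.0 - abar_t) * eps) / math.sqrt(abar_t)

def ddpm_step(x_t, eps, beta_t, abar_t, rng, last_step=False):
    """One ancestral DDPM step with sigma_t^2 = beta_t (assumed variance choice)."""
    alpha_t = 1.0 - beta_t
    mean = (x_t - beta_t / math.sqrt(1.0 - abar_t) * eps) / math.sqrt(alpha_t)
    if last_step:
        return mean  # no noise is added at the final step
    return mean + math.sqrt(beta_t) * rng.gauss(0.0, 1.0)

def ddim_step(x_t, eps, abar_t, abar_prev):
    """One deterministic DDIM step (eta = 0); abar_prev may skip many timesteps."""
    x0 = predict_x0(x_t, eps, abar_t)
    return math.sqrt(abar_prev) * x0 + math.sqrt(1.0 - abar_prev) * eps

rng = random.Random(0)
x0, eps = 0.8, -0.3                 # toy clean pixel and the noise added to it
abar_t, abar_prev = 0.5, 0.6        # assumed cumulative alphas at t and t-1
beta_t = 1.0 - abar_t / abar_prev   # because abar_t = abar_prev * (1 - beta_t)
x_t = math.sqrt(abar_t) * x0 + math.sqrt(1.0 - abar_t) * eps

print("DDPM:", ddpm_step(x_t, eps, beta_t, abar_t, rng))
print("DDIM:", ddim_step(x_t, eps, abar_t, abar_prev))
```

Because the DDIM step is deterministic and only needs the cumulative alphas at the two endpoints, it can use far fewer steps than DDPM, usually at some cost in sample quality.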

3.6. Model

  • Unet: Encoder, Decoder
  • Unconditional Diffusion Model
  • Conditional Diffusion Model (label, image, text; the text embedder still needs to be implemented)
  • Variational autoencoder: Vanilla (only works for reconstruction), VQ
  • Latent diffusion model
  • Stable diffusion model
  • Classifier-free guidance: not working yet
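
For reference, the standard classifier-free guidance recipe has two parts: randomly dropping the condition during training, and mixing conditional and unconditional noise predictions at sampling time. The sketch below shows that textbook formulation with hypothetical helper names; it is not this repository's code and is not a fix for the issue noted above.

```python
import random

def maybe_drop_condition(label, null_label, drop_prob, rng):
    """Training: replace the label with a 'null' label with probability drop_prob."""
    return null_label if rng.random() < drop_prob else label

def guided_noise(eps_uncond, eps_cond, guidance_scale):
    """Sampling: eps = eps_uncond + s * (eps_cond - eps_uncond).

    s = 0 is unconditional, s = 1 is plain conditional, s > 1 strengthens the condition.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

rng = random.Random(0)
labels = [maybe_drop_condition(7, -1, 0.1, rng) for _ in range(10)]
print(labels)                                        # mostly 7, occasionally -1
print(guided_noise([0.0, 1.0], [1.0, 0.5], 2.0))     # extrapolates past eps_cond
```

Some papers write the same rule as (1 + w) * eps_cond - w * eps_uncond, which matches the form above with s = 1 + w.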

4. Results

4.1. Unconditional Diffusion

Dataset        Image-Size  FID (features=2048, DDIM -> DDPM)  Config
Mnist          32x32       2.65 -> 0.89                       Train, Eval
Fashion-Mnist  32x32       3.31 -> 2.42                       Train, Eval
Cifar10        32x32       5.54 -> 3.58                       Train, Eval
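
For reference, FID is the Fréchet distance between two Gaussians fitted to feature activations of real and generated images, and lower is better. In the formula below, (\mu_r, \Sigma_r) and (\mu_g, \Sigma_g) are the feature mean and covariance for real and generated images. "features=2048" presumably refers to the 2048-dimensional Inception-v3 pooling features; that is an assumption, since the table does not say which FID implementation was used.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```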

Mnist Generation Fashion Generation Cifar10 Generation

4.2. Conditional Diffusion

Dataset        Image-Size  FID (features=2048, DDIM -> DDPM)  Config
Mnist          32x32       3.91 -> 1.16                       Train, Eval
Fashion-Mnist  32x32       3.10 -> 2.15                       Train, Eval
Cifar10        32x32       5.66 -> 3.37                       Train, Eval
Gender         64x64       3.                                 Train, Eval
CelebA         64x64       3.                                 Train, Eval

Mnist Generation Fashion Generation Cifar10 Generation

Male Generation Female Generation

  • Sketch2Image (Sketch, Fake, Real)

AFHQ Sketch AFHQ Fake AFHQ Real

4.3. DDPM and DDIM

DDPM (64x64)

DDPM Generation

DDIM (64x64)

DDIM Generation

4.4. Diffusion Interpolation (64x64)

Interpolation Generation

4.5. VAE Reconstruction

CIFAR10

Cifar10 Reconstruction

AFHQ

AFHQ Reconstruction

GENDER

Gender Reconstruction

CELEBA

Celeba Reconstruction

4.6. VAE Interpolation

CIFAR10 (32x32)

Cifar10 Interpolation

AFHQ (64x64)

AFHQ Interpolation

CELEBA (128x128)

Celeba Interpolation

4.7. Latent Diffusion

GENDER (128x128)

Gender Generation

AFHQ (256x256)

AFHQ Generation
