Hi,
I was looking at the current implementation and noticed that before every generation you pass all reference images through the VAE as one batch. Beyond a certain number of reference images, I believe that would require a huge amount of VRAM.
Wouldn't it be better to compute the latents for each selected image beforehand, store them either in RAM or temporarily on disk, and then load them at generation time?
That way you avoid a large batch in the VAE, and you compute the latents only once per reference image instead of once per generation.
What do you think?
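A minimal sketch of the idea, assuming the VAE encode step is wrapped in some callable (the names `LatentCache` and `encode_fn` are hypothetical, not part of fabric; the real code would pass each image through the pipeline's VAE one at a time instead of as a batch):

```python
class LatentCache:
    """Compute each reference image's latent once and reuse it.

    encode_fn is whatever produces a latent from a single image,
    e.g. a wrapper around vae.encode(...) in the real pipeline
    (hypothetical here).
    """

    def __init__(self, encode_fn):
        self.encode_fn = encode_fn
        self._cache = {}  # image_id -> latent; could also be spilled to disk

    def get(self, image_id, image):
        # Encode only on first sight of this image; subsequent
        # generations reuse the stored latent, so the VAE never
        # sees the whole reference set as one batch.
        if image_id not in self._cache:
            self._cache[image_id] = self.encode_fn(image)
        return self._cache[image_id]
```

Usage: call `cache.get(path, image)` for each selected reference before a generation; only newly added references hit the VAE.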
https://github.com/sd-fabric/fabric/blob/caaa5831bacefb060d46168372b45e3bac84a3ae/fabric/generator.py#L357C1-L373C14