[Figure: Framework of a text-to-image generator. The input prompt "A cat in the snow" passes through three numbered components: Text Encoder, Generation Model, and Decoder.]
[Figure: Two plots of FID-10K (y-axis) against CLIP Score (x-axis). (a) Impact of encoder size, comparing T5-Small, T5-Large, T5-XL, and T5-XXL. (b) Impact of U-Net size, comparing 300M, 500M, 1B, and 2B parameters.]
Source: https://arxiv.org/abs/2205.11487