this post was submitted on 10 Apr 2024

StableDiffusion

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/trakusmk on 2024-04-10 05:35:55.


Hey everyone, I’ve been following the development of text-to-image models and noticed something interesting. A lot of the new models and papers, like Stable Diffusion 3, PixArt-Sigma, and ELLA, use FLAN-T5 for text encoding. Considering there are bigger models out there, like Mistral 7B with 7 billion parameters, or even Llama 70B, with much stronger language understanding, I’m curious why researchers are sticking with a smaller, older model like FLAN-T5. Any thoughts on why this might be the case?
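For context on how these models actually consume the text encoder: the diffusion backbone (U-Net or DiT) cross-attends to the encoder's per-token embeddings, so all it needs from the language model is a good sequence of embeddings, not generation ability. Below is a minimal, self-contained sketch of that cross-attention step in PyTorch. All sizes and weight matrices are made-up stand-ins (random tensors in place of real T5 outputs and trained projections), purely to show the shape of the interface:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes for illustration: 77 prompt tokens with 512-dim text
# embeddings (FLAN-T5-XXL actually outputs 4096-dim vectors), and 64 latent
# image tokens with 320 dims.
text_emb = torch.randn(1, 77, 512)    # stand-in for the T5 encoder output
img_tokens = torch.randn(1, 64, 320)  # stand-in for U-Net/DiT latent tokens

# Cross-attention: image queries attend over text keys/values.
d = 320
W_q = torch.randn(320, d) / d**0.5    # random stand-ins for learned projections
W_k = torch.randn(512, d) / d**0.5
W_v = torch.randn(512, d) / d**0.5

q = img_tokens @ W_q                  # (1, 64, 320)
k = text_emb @ W_k                    # (1, 77, 320)
v = text_emb @ W_v                    # (1, 77, 320)

# Each image token takes a softmax-weighted mix of the text tokens.
attn = F.softmax(q @ k.transpose(-1, -2) / d**0.5, dim=-1)  # (1, 64, 77)
conditioned = attn @ v                # (1, 64, 320)
print(conditioned.shape)              # torch.Size([1, 64, 320])
```

Any model that emits fixed-size per-token embeddings can slot into this interface, which is part of why the choice of encoder is a swappable design decision rather than a hard constraint.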

no comments (yet)