this post was submitted on 10 Apr 2024

StableDiffusion

This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/trakusmk on 2024-04-10 05:35:55.


Hey everyone, I’ve been following the development of text-to-image models and noticed something interesting. A lot of the new models and papers, like Stable Diffusion 3, PixArt-Sigma, and ELLA, use FLAN-T5 for text encoding. Considering there are bigger models out there, like Mistral 7B with 7 billion parameters, or even Llama 70B, with much stronger language understanding, I’m curious why researchers are sticking with a smaller, older model like FLAN-T5. Any thoughts on why this might be the case?
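For context on how these models actually consume the text encoder: the diffusion backbone (U-Net or DiT) cross-attends to the encoder's per-token embeddings, so all it needs from the language model is a good sequence of embeddings, not generation ability. Below is a minimal, self-contained sketch of that cross-attention step in PyTorch. All sizes and weight matrices are made-up stand-ins (random tensors in place of real T5 outputs and trained projections), purely to show the shape of the interface:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes for illustration: 77 prompt tokens with 512-dim text
# embeddings (FLAN-T5-XXL actually outputs 4096-dim vectors), and 64 latent
# image tokens with 320 dims.
text_emb = torch.randn(1, 77, 512)    # stand-in for the T5 encoder output
img_tokens = torch.randn(1, 64, 320)  # stand-in for U-Net/DiT latent tokens

# Cross-attention: image queries attend over text keys/values.
d = 320
W_q = torch.randn(320, d) / d**0.5    # random stand-ins for learned projections
W_k = torch.randn(512, d) / d**0.5
W_v = torch.randn(512, d) / d**0.5

q = img_tokens @ W_q                  # (1, 64, 320)
k = text_emb @ W_k                    # (1, 77, 320)
v = text_emb @ W_v                    # (1, 77, 320)

# Each image token takes a softmax-weighted mix of the text tokens.
attn = F.softmax(q @ k.transpose(-1, -2) / d**0.5, dim=-1)  # (1, 64, 77)
conditioned = attn @ v                # (1, 64, 320)
print(conditioned.shape)              # torch.Size([1, 64, 320])
```

Any model that emits fixed-size per-token embeddings can slot into this interface, which is part of why the choice of encoder is a swappable design decision rather than a hard constraint.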

no comments (yet)