1
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/trakusmk on 2024-04-10 05:35:55.


Hey everyone, I’ve been following the development of text-to-image models and noticed something interesting. A lot of the new models and papers, like Stable Diffusion 3, PixArt Sigma, and ELLA, use the FLAN-T5 model for text encoding. Considering there are bigger models out there, like Mistral 7B or even Llama 70B, with much greater language understanding, I’m curious why researchers are sticking with a smaller, older model like FLAN-T5. Any thoughts on why this might be the case?
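
For context, "text encoding" in these pipelines usually means running the prompt through the frozen T5 *encoder* only and feeding the resulting token embeddings to the diffusion model via cross-attention. Below is a minimal sketch of that pattern with the Hugging Face transformers library; the checkpoint name google/flan-t5-xl and the 77-token padding length are just illustrative choices, not the exact setup of any of the models mentioned above.

```python
# Minimal sketch: turning a prompt into conditioning embeddings with a frozen
# FLAN-T5 encoder (encoder-only use, as text-to-image models typically do).
# "google/flan-t5-xl" and max_length=77 are example choices, not the settings
# of SD3 / PixArt / ELLA.
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
encoder = T5EncoderModel.from_pretrained("google/flan-t5-xl")
encoder.requires_grad_(False)  # the text encoder stays frozen during diffusion training

prompt = "a templar knight standing in a neon-lit street at night, rain, cinematic"
tokens = tokenizer(prompt, padding="max_length", max_length=77,
                   truncation=True, return_tensors="pt")

with torch.no_grad():
    # Per-token hidden states, shape (1, 77, hidden_dim); this sequence is what
    # the diffusion model cross-attends to as its text conditioning.
    text_embeddings = encoder(input_ids=tokens.input_ids,
                              attention_mask=tokens.attention_mask).last_hidden_state
```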

2
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/OnefunnyMoFo on 2024-04-09 19:12:27.

3
1
Neo Feudal Japan (i.redd.it)
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Wrong-Analyst-4399 on 2024-04-09 20:32:31.

4
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Wrong-Analyst-4399 on 2024-04-09 19:53:44.

5
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Theblasian35 on 2024-04-09 18:57:38.

6
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/unodewae on 2024-04-09 10:34:42.

7
0
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/OnefunnyMoFo on 2024-04-09 18:12:24.

8
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Yaaasbetch on 2024-04-09 09:33:54.

9
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/whimsical_sarah on 2024-04-09 16:41:27.

10
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/MaxwellTechnology on 2024-04-09 14:25:41.


ELLA weights have been released for SD 1.5, with inference code. Disclaimer: I am not the author.

11
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/ryanrybot on 2024-04-09 08:39:44.

12
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/icchansan on 2024-04-09 13:44:13.

13
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Temporary-Data9219 on 2024-04-09 13:23:39.

14
1
Crayon monster (i.redd.it)
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/BlueeWaater on 2024-04-09 13:14:01.

15
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/GianoBifronte on 2024-04-09 11:36:35.

Original Title: Release: AP Workflow 9.0 for ComfyUI - Now featuring SUPIR next-gen upscaler, IPAdapter Plus v2 nodes, a brand new Prompt Enricher, Dall-E 3 image generation, an advanced XYZ Plot, 2 types of automatic image selectors, and the capability to automatically generate captions for an image directory


AP Workflow 9.0 for ComfyUI

So. I originally wanted to release 9.0 with support for the new Stable Diffusion 3, but that was way too optimistic. While waiting for it, as always, the number of new features and changes snowballed to the point that I have to release it as is.

Support for SD3 will arrive with AP Workflow 10.

The Early Access program I created for APW 9.0 was a success, so I'll continue to provide early access builds of APW 10 via Discord, where I offer *limited and not guaranteed* support (though people seem happy with the speed and quality of the help I've provided so far).

New features

  • The AP Workflow now features two next-gen upscalers: CCSR, and the new SUPIR. Since one performs better than the other depending on the type of image you want to upscale, each one has a dedicated function. Additionally, the Upscaler (SUPIR) function can be used to perform Magnific AI-style creative upscaling.
  • A new Image Generator (Dall-E) function allows you to generate an image with OpenAI Dall-E 3 instead of Stable Diffusion. This function should be used in conjunction with the Inpainter without Mask function to take advantage of Dall-E 3's superior prompt-following and Stable Diffusion's superior ecosystem of fine-tunes and LoRAs. You can also use this function in conjunction with the Image Generator (SD) function to simply compare how each model renders the same prompt.
  • A new Advanced XYZ Plot function allows you to study the effect of ANY parameter change in ANY node inside the AP Workflow.
  • A new Face Cloner function uses the InstantID technique to quickly change the style of any face in a Reference Image you upload via the Uploader function.
  • A new Face Analyzer function allows you to evaluate a batch of generated images and automatically choose the ones whose facial landmarks closely resemble those of a reference image you upload via the Uploader function. This function is especially useful in conjunction with the new Face Cloner function.
  • A new Training Helper for Caption Generator function allows you to use the Caption Generator function to automatically caption hundreds or thousands of images in a batch directory. This is useful for model training purposes. The Uploader function has a new Load Image Batch node to accommodate this new feature. To use this capability, you must activate both the Caption Generator and the Training Helper for Caption Generator functions in the Controller function.
  • The AP Workflow now features a number of u/rgthree Bookmark nodes to quickly recenter the workflow on the 10 most used functions. You can move the Bookmark nodes where you prefer to customize your hyperjumps.
  • The AP Workflow now supports u/cubiq’s new IPAdapter Plus v2 nodes.
  • The AP Workflow now supports the new PickScore nodes, used in the Aesthetic Score Predictor function.
  • The Uploader function now allows you to upload both a source image and a reference image. The latter is used by the Face Cloner, the Face Swapper, and the IPAdapter functions.
  • The Caption Generator function now offers the possibility to replace the user prompt with a caption automatically generated by Moondream v1 or v2 (local inference), GPT-4V (remote inference via OpenAI API), or LLaVA (local inference via LM Studio).
  • The three Image Evaluators in the AP Workflow are now daisy-chained for sophisticated image selection. First, the Face Analyzer (described above) automatically chooses the image(s) with the face that most closely resembles the original. From there, the Aesthetic Score Predictor further ranks the quality of the images and automatically chooses the ones that match your criteria. Finally, the Image Chooser allows you to manually decide which image to further process via the image manipulator functions in the L2 of the pipeline. You can use only one of these Image Evaluators, or any combination of them, by enabling each one in the Controller function.
  • The Prompt Enricher function has been greatly simplified, and it now works again with open-access models served by LM Studio, Oobabooga, etc., thanks to u/glibsonoran’s new Advanced Prompt Enhancer node (see the sketch after this list for how this local-server pattern works).
  • The Image Chooser function can now be activated from the Controller function with a dedicated switch, so you don’t have to navigate the workflow just to enable it.
  • The LoRA Info node is now relocated inside the Prompt Builder function.
  • The configuration parameters of various nodes in the Face Detailer function have been modified to (hopefully) produce much better results.
  • The entire L2 pipeline layout has been reorganized so that each function can be muted instead of bypassed.
  • The ReVision function is gone. Probably, nobody was using it.
  • The Image Enhancer function is gone, too. You can obtain a creative upscaling of equal or better quality by reducing the strength of ControlNet in the SUPIR node.
  • The StyleAligned function is gone, too. IPAdapter has become so powerful that there’s no need for it anymore.
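
For readers curious how the LM Studio / Oobabooga route mentioned above works under the hood: those tools expose an OpenAI-compatible HTTP endpoint, and a prompt-enrichment call against it boils down to a single chat-completion request. The sketch below is my own illustration of that pattern, not the code of the Advanced Prompt Enhancer node; the base URL, model name, and system prompt are assumptions you would adapt to your local setup.

```python
# Illustrative only: enriching a short user prompt via an OpenAI-compatible
# local server (LM Studio defaults to http://localhost:1234/v1). The model
# name and system prompt below are placeholders, not APW settings.
import requests

BASE_URL = "http://localhost:1234/v1"  # assumption: default LM Studio endpoint


def enrich_prompt(user_prompt: str) -> str:
    payload = {
        "model": "local-model",  # LM Studio serves whatever model is currently loaded
        "messages": [
            {"role": "system",
             "content": "Rewrite the user's prompt as a richly detailed "
                        "Stable Diffusion prompt. Reply with the prompt only."},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.7,
    }
    response = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"].strip()


print(enrich_prompt("a templar knight in neo-feudal japan"))
```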

You can download the AP Workflow 9.0 for ComfyUI here:

Workshops

Companies and educational institutions have started asking for in-person workshops to master the AP Workflow and the infinite possibilities offered by Stable Diffusion + ComfyUI.

Videos are great (and I'm thinking about doing them), but they can't replace direct interaction when it comes to solving challenges that are unique to you.

If you are interested, reach out.

Special Thanks

The AP Workflow wouldn't exist without the incredible work done by all the node authors out there. For the AP Workflow 9.0, I worked closely with u/Kijai, u/glibsonoran, u/tzwm, and u/rgthree, to test new nodes, optimize parameters (don't ask me about SUPIR), develop new features, and correct bugs.

These people are exceptional. They went above and beyond to steer their work in a direction that would help me and facilitate its inclusion in the AP Workflow. If you are hiring, hire them.

And, of course, on top of them, there are the dozens of other node authors who created all the nodes powering the AP Workflow. Thank you all!

16
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/PicassoPix on 2024-04-09 11:14:00.

17
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Wllknt on 2024-04-09 07:15:20.

18
1
Templar knight (i.redd.it)
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/anjo08 on 2024-04-09 09:58:58.

19
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Xerophayze on 2024-04-09 06:05:37.

20
1
Missing Woman (www.reddit.com)
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Embarrassed_War_6363 on 2024-04-09 05:34:09.

21
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/StableLlama on 2024-04-08 23:49:45.


Just a few hours old and not mentioned here:

Cos Stable Diffusion XL 1.0 and Cos Stable Diffusion XL 1.0 Edit

Cos Stable Diffusion XL 1.0 Base is tuned to use a Cosine-Continuous EDM VPred schedule. The most notable feature of this schedule change is its capacity to produce the full color range from pitch black to pure white, alongside more subtle improvements to the model's rate-of-change to images across each step.

Cos Stable Diffusion XL 1.0 Edit is tuned to use a Cosine-Continuous EDM VPred schedule, and then upgraded to perform instructed image editing. This model takes a source image as input alongside a prompt, and interprets the prompt as an instruction for how to alter the image.
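
For anyone wanting to try the base model locally, here is a rough sketch of how it could be loaded with diffusers. It assumes a recent diffusers release that ships EDMEulerScheduler, a locally downloaded cosxl.safetensors file (placeholder path), and sigma values taken from community examples rather than an official reference, so treat it as a starting point, not canonical usage.

```python
# Sketch, not official usage: loading Cos Stable Diffusion XL 1.0 Base with an
# EDM v-prediction scheduler. Requires a recent diffusers with EDMEulerScheduler.
import torch
from diffusers import StableDiffusionXLPipeline, EDMEulerScheduler

pipe = StableDiffusionXLPipeline.from_single_file(
    "cosxl.safetensors",          # placeholder: path to the downloaded checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Assumed scheduler settings (sigma_data=1.0, sigma_max=120) based on community
# examples for this model; the key points are the EDM schedule and v-prediction.
pipe.scheduler = EDMEulerScheduler(
    sigma_min=0.002,
    sigma_max=120.0,
    sigma_data=1.0,
    prediction_type="v_prediction",
)

image = pipe("a lighthouse at midnight, pitch-black sky, single bright beam",
             num_inference_steps=25, guidance_scale=7.0).images[0]
image.save("cosxl_test.png")
```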

22
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/aibot-420 on 2024-04-08 21:22:02.

23
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Boxxygen on 2024-04-08 18:53:10.

24
1
Desert Pane (i.redd.it)
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/Wrong-Analyst-4399 on 2024-04-08 18:01:33.

25
1
submitted 5 months ago by [email protected] to c/[email protected]
This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/ninjasaid13 on 2024-04-08 23:49:41.
