So I've been wanting to share something useful back to the community here for a little while and hopefully open up some of the possibilities with Stable Diffusion. I realise Stable Diffusion 3 is coming, some day... but I think I've managed to get a decent workflow going in the last few days that has excellent prompt adherence (for SDXL).
The first thing to share here is an SDXL LoRA file (which will be needed later in this process). It was trained entirely on my own AI-generated images from previous LoRAs I created. I am happy to share this one because the training material contained none of my originally shot images, so it includes no copyrighted material and cannot produce images that resemble the original models. You can of course just use it with standard SDXL and make some gunge images. The LoRA was trained on around 60 images with GPT-4V captions that were then edited. It will lean more towards Asian women but can be prompted out of that. You should use the keyword 'gunge' in your prompt if using this, and can use other variations like 'covered in gunge', 'gunge pouring on her head', 'gunge on her body', etc. This won't give great prompt adherence if used alone, so read on for something cool...
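For anyone who would rather try the LoRA on plain SDXL from a script than from a UI, a minimal sketch using the Hugging Face `diffusers` library might look like the following. Only the trigger word 'gunge' comes from the post; the helper names, the step count and the LoRA file name are placeholders assumed for illustration, and the base model ID is the standard public SDXL base checkpoint rather than anything the post specifies.

```python
from pathlib import Path

TRIGGER = "gunge"  # trigger keyword named in the post


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Return the prompt unchanged if it already mentions the trigger word,
    otherwise append a short phrase that contains it."""
    if trigger.lower() in prompt.lower():
        return prompt
    return f"{prompt.rstrip(' .,')}, covered in {trigger}"


def generate(prompt: str, lora_path: str, out_path: str = "out.png") -> Path:
    """Run base SDXL with the LoRA applied and save one image.

    Needs an Nvidia GPU plus the `torch` and `diffusers` packages; they are
    imported here so that `with_trigger` can be used without them.
    """
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
    ).to("cuda")
    # lora_path is wherever you saved the downloaded LoRA file (assumption).
    pipe.load_lora_weights(lora_path)
    image = pipe(prompt=with_trigger(prompt), num_inference_steps=30).images[0]
    image.save(out_path)
    return Path(out_path)
```

A call such as `generate("photo of a woman on a game show set", "gunge_sdxl.safetensors")` would then write `out.png`, where `gunge_sdxl.safetensors` is a hypothetical file name standing in for the real download.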
The next piece of this is pretty great. This goes full nerdy, so unless you are looking to move into SDXL generation, this might not mean much.
So, I've managed to get a workflow going in ComfyUI that combines the amazing prompt adherence of PixArt Sigma with Perturbed Attention Guidance (PAG) and SDXL. This is based on the original Abominable Spaghetti Workflow, with a switch from SD 1.5 to SDXL and an integrated PAG model, the gunge LoRA and a face detailer.
The upshot is that the workflow can stick pretty close to the prompt and create some cool images. You can download the ComfyUI workflow here: DOWNLOAD WORKFLOW BY RIGHT CLICK AND SAVE. Import this into ComfyUI; once imported, the workflow contains a readme box that explains all the parts you need to download to make it work. If you have not used ComfyUI before, I suggest doing a little reading or YouTube watching first. This will require quite a bit of VRAM and an Nvidia GPU. I'd recommend a 16GB card, but you might just squeeze it into 12GB.
As this is newly built, it is not what I used to generate my previous images, but it will certainly be a new addition for some scenes. Below are some examples of prompts and the generated images to show the prompt adherence. These are directly out of the workflow and have had no touching up or enhancement in any way. The prompts were written with GPT-4 assistance. Note the word 'gunge' in the prompts to trigger the LoRA. **Full size images are attached to the post at the bottom**
************
a close-up image of a woman in a dynamic, expressive pose within an artistic setting. She is engulfed by thick, viscous gunge in vivid yellow and pink, pouring from above and completely covering her. The lighting is dramatic and dynamic, with sharp contrasts--highlighting the bright colors of the gunge against shadows and illuminated areas around her. The background, though blurred, hints at large abstract paintings and scattered art supplies, enhancing the creative and chaotic ambiance. The woman's expression captures joy and surprise, her arms slightly raised, engaging vividly with the playful mess. The lighting accentuates the textures of the gunge and the emotional impact of the scene
a brightly lit, colorful scene on a whimsical Japanese TV game show set, where an attractive young Asian woman with a cheerful expression is navigating a cute and vibrant obstacle course. The entire scene is styled like a cartoonish playground, with soft, rounded obstacles in pastel colors, each generously splattered with neon green gunge. The contestant, wearing a playful and colorful outfit adorned with kawaii (cute) icons and patterns, is covered in globs of shiny, neon green gunge that adds a fun and messy twist to her appearance. The background is filled with a lively, animated audience, their reactions exaggerated and comical, with some audience members also playfully splashed with green gunge. Above, cheerful banners and balloons add to the festive atmosphere, while digital scoreboards display cute Japanese characters and animations. The overall tone is light-hearted and fun, capturing the essence of a cutesy, energetic Japanese game show, complete with all the whimsy and laughter such a setting implies
a close-up, vibrant scene inside a colorful ice cream parlor, where a young woman is humorously struggling with a wildly malfunctioning ice cream machine. This lively machine, adorned with bright, cheerful colors and playful designs, has taken on a life of its own, vigorously ejecting ice cream gunge. The woman, captured in a medium close-up, is clad in a charming, retro-style uniform complete with an apron and a whimsical hat. Despite her best efforts to control the situation using the machine's levers and buttons, she is completely drenched in a delightful mess of ice cream gunge. The thick, colorful gunge covers her from head to toe in shades of pastel pink, mint green, and baby blue, sticking to her clothes and hair, creating a comically chaotic look. The ice cream machine, central in the frame, continues to churn out more ice cream gunge, emphasizing its uncontrollable nature. Her facial expressions, a mix of astonishment and laughter, along with her futile attempts to shield herself with a large spoon, add to the humor and dynamism of the scene. The focus remains tight on her and the machine, capturing every detail of the ice cream gunge as it adds more layers to her already colorful coating, encapsulating the playful disaster unfolding in this whimsical ice cream parlor
a bustling fairground, filled with the vibrant sounds of carnival music and bursts of laughter. At the center of this festive scene, a young woman sits on a stool, eagerly participating in a popular fairground game. The woman is immediately splashed with thick, colorful gunge--neon green, bright blue, and hot pink, enthusiastic fairgoers surround her. Dressed casually in a t-shirt and shorts, and with a wide smile, she becomes the centerpiece of this lively event. Positioned in front of a large, colorfully decorated backdrop that enhances the carnival atmosphere, each successive splash adds more vibrant layers of gunge, which amusingly drip down the backdrop and pool around her on the platform. Surrounded by ongoing cheers and laughter, she revels in the messy fun.
In an intimate atmosphere of a hotel room, an attractive nude couple finds closeness laying down on a sleek leather sofa. The man, with a gleam of mischief in his eyes, smears vibrant green gunge all over the woman's body. The thick, green gunge flows down her hair and drapes over her shoulders and exposed breasts, stark against the dark leather of the sofa. She reacts with a groan of ecstasy, her eyes alight with surprise and delight. This messy interaction, filled with a seductive feeling and spontaneous fun, captures an unconventional and joyful form of intimacy. The scene is set against the chic backdrop of the hotel room, the glossy leather sofa adding a touch of elegance to their playful escapade.
In the dimming light of the enchanted forest, a princess in a radiant blue ball gown is on her way back to her castle when she suddenly stumbles into an unseen mudpit. This close-up shot captures her from just above the waist, emphasizing the stark transformation caused by the thick, enveloping mud gunge. The mud gunge clings to her gown and skin, obscuring the vibrant blue with its dark, heavy texture. It splashes up to her neckline, smearing across her elegant collar and staining the delicate fabric. Her face is speckled with mud gunge, highlighting her startled yet resilient expression. Her eyes, wide with surprise, reflect a spark of resolve against the muddy adversity. This shot brings the viewer directly into her moment of disarray, surrounded by the shadowy greens of the forest's twilight.
Hope some of you might get some use out of this and can hopefully improve upon it and share back any of this. If you do get it going I'd love to see what you create.
Absolutely fantastic results, and although I also don't understand this beyond playing with basic prompts, I can appreciate how much time you've given this. To be willing to gift it to the community is worthy of Hero status in my eyes!!!
Thank you. Really hoping we can build up some home-grown community tools and knowledge of how to get the best out of AI when it comes to WAM. Also, it's important we respect the copyright of hardworking producers to keep sentiment towards AI positive.
Took a go at some pie pics using this workflow and LoRA. The LoRA was not trained on any pie pics, but it still managed to do a halfway decent job. I'd love to do a pie-specific LoRA at some point.
Here is the prompt:
an image of a comedic TV gameshow scene where a woman, with a joyful and surprised expression, is sitting on a bright, colorful stage set. She is seated on a simple, modern chair at the center of the stage. The woman is being playfully hit in the face with an assortment of cream pies, each cream pie bursting and splattering on her face with whipped cream gunge in a visually satisfying way. Emphasize the splatter of the whipped cream gunge and the fragments of pie crust flying on her face. The whipped cream gunge from the pies is all over her face and body, adding vibrant contrast against her casual, dark-colored outfit. The background should include a cheering audience, bright stage lights, and a large, flashy gameshow sign. The overall atmosphere is lively and humorous, with a focus on the whimsical chaos of whipped cream gunge and pie crust debris on her face
These images are again straight out of the generator with no enhancement or touch up.
Tried out a selection of other relatively simple prompts, mostly portrait shots trying out different scenes and mess. Some were given a more artsy styling to see what came out.
This is fantastic! Great work on the workflow, and thanks so much for the LoRA as well! Very impressive stuff. Need to work out if I can make XL work on my remote setup now. Been avoiding it until now due to the higher compute costs, but this might make it worth it! Out of interest, do you find it works with the more anime-style checkpoints as well? I know some of your images are that way, and I sometimes find them preferable to the photorealism in uncanny-valley terms.
Hey! I'm nearly there with it on my cloud service. Managed to get Comfy running OK, downloaded all the various bits linked in the readme of your workflow, and installed most of the remaining missing modules with ComfyUI Manager. Might be worth updating the readme with this repo too: https://github.com/pythongosssss/ComfyUI-Custom-Scripts. It's not in Manager, so it needs to be added manually for the custom LoraLoader.
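For reference, adding a custom node pack manually usually just means cloning its repository into ComfyUI's `custom_nodes` folder and restarting ComfyUI. A rough sketch of that step is below; the function names are made up for illustration, the ComfyUI root is the `/notebooks/ComfyUI` path from this setup, and it assumes `git` is available on the machine.

```python
import subprocess
from pathlib import Path


def clone_command(repo_url: str, comfy_root: str) -> list[str]:
    """Build the git command that clones a node pack into custom_nodes."""
    # Use the last path component of the URL as the folder name.
    name = repo_url.rstrip("/").removesuffix(".git").rsplit("/", 1)[-1]
    dest = Path(comfy_root) / "custom_nodes" / name
    return ["git", "clone", repo_url, str(dest)]


def install_custom_node(repo_url: str, comfy_root: str) -> None:
    """Clone the node pack; raises CalledProcessError if git fails."""
    subprocess.run(clone_command(repo_url, comfy_root), check=True)
```

Calling `install_custom_node("https://github.com/pythongosssss/ComfyUI-Custom-Scripts", "/notebooks/ComfyUI")` and then restarting ComfyUI should make the extra nodes show up. Some node packs also ship a `requirements.txt` that needs installing separately.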
Then I was wondering if I could ask for a bit of tech support on the following?
I'm really struggling with the T5 loader. There wasn't a /t5/ directory in my models dir to begin with, so I created the dir and downloaded the following files:
root@n27glgpyrb:/notebooks/ComfyUI/models/t5# du -ah
1.0K ./config.json
8.5G ./model-00002-of-00002.safetensors
9.4G ./model-00001-of-00002.safetensors
I get the following error from the T5 loader though when trying to run the workflow:
!!! Exception during processing!!! Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /notebooks/ComfyUI/models/t5.
Now I don't have any files like that in the dir, because I just created it, and yet the t5 loader module isn't showing as missing or anything. Feels like I've accidentally swerved an install step here, but not sure what! Found an equivalent thread from people using BERT, where the fix was it was expecting a different file compression to what it was pointed to, but assume that's not relevant to this?
It's another bit of my bad documentation. You will need to add ComfyUI_ExtraModels from ComfyUI Manager to sort the T5 thing. My bad. It should be in Manager and listed as being from city96. This is it: https://github.com/city96/ComfyUI_ExtraModels but it's easier to just add it from ComfyUI Manager.
Hope that helps
Thanks! Much appreciated. Will give it a crack this weekend. Sounds like that'll be the fix though.
Any chance you resolved the !!! Exception during processing!!! Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory error?
I seem to be struggling with the same issue even after installing the extra models with the manager
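One thing that may be worth ruling out, offered as a guess rather than a confirmed fix: the directory listing earlier in the thread shows `config.json` and two `model-0000X-of-00002.safetensors` shards, but no `model.safetensors.index.json`. Hugging Face `transformers` normally relies on that index file to find sharded weights, so a missing index (or an older `transformers` version) could plausibly produce this "no file named pytorch_model.bin ..." error. The small check below only inspects file names in the folder; the function name and the messages are invented for illustration.

```python
from pathlib import Path

# File names transformers conventionally looks for (single-file weights
# and the index files that accompany sharded weights).
SINGLE_FILE_WEIGHTS = ("model.safetensors", "pytorch_model.bin")
SHARD_INDEXES = ("model.safetensors.index.json", "pytorch_model.bin.index.json")


def diagnose_model_dir(model_dir: str) -> list[str]:
    """Return a list of likely problems with a local model folder."""
    names = {p.name for p in Path(model_dir).iterdir() if p.is_file()}
    problems = []
    if "config.json" not in names:
        problems.append("config.json is missing")
    has_single = any(n in names for n in SINGLE_FILE_WEIGHTS)
    has_index = any(n in names for n in SHARD_INDEXES)
    # Shards follow the pattern model-00001-of-00002.safetensors.
    shards = [n for n in names if "-of-" in n and n.endswith((".safetensors", ".bin"))]
    if shards and not has_index:
        problems.append("sharded weights found but the index json is missing")
    if not shards and not has_single:
        problems.append("no weight files found")
    return problems


if __name__ == "__main__":
    import sys

    # Pass the folder to check as the first argument, e.g. the t5 models dir.
    if len(sys.argv) > 1:
        for line in diagnose_model_dir(sys.argv[1]) or ["nothing obviously missing"]:
            print(line)
```

If it reports a missing index, downloading `model.safetensors.index.json` from the same place as the shards into the `t5` folder would be the thing to try first.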