So as I mentioned in my first post, I quickly got bored of "safe" image generators (I like my WAM with nudity ...) and spent a while figuring out how to generate NSFW images.
As you may have guessed from the title, the answer was Stable Diffusion, a family of models that you can freely download and run locally, plus one particular website that I highly recommend: https://civitai.com/
It's a community where AI models are developed and shared, with permissive but sensible policies, similar to the policies here. You can do a lot with a free account; a few features are for-pay, and having used some of them I did not detect any kind of subscription trap or other scam.
The SDXL model series generates high-resolution (1024x1024) images and is not censored; the base model I'm using right now is a variant called "DreamShaper XL", which you can choose from the "model" search on civitai's "generate" page.
I attach an example photo generated with this model and the prompt "full length photo of woman standing nude, slimed in pink slime, white background"; it took about 10 tries to get one this good. If you're interested, you should be able to get results like this directly on civitai.com with a free account; the only settings you need are the model and the prompt.
There is quite a long way from here to the content I'm producing for messy-ai ...
The first big thing is that I run the model locally. The models are free to download from civitai, and I use a free, open-source tool, "automatic1111" (the Stable Diffusion web UI), to run them. You need a reasonably powerful graphics card; I have an AMD 6800 XT with 16GB of VRAM.
The reason this is needed is that the Stable Diffusion models are quite "hit and miss", particularly on, ehm, "specialist" content, so I generate hundreds or thousands of images and pick the good ones. "automatic1111" has a JSON API you can script against; I use this with some custom scripts to vary the prompt, plus some gallery code to pick out the images I want.
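To give an idea of what that scripting looks like, here's a minimal sketch of the kind of loop I mean, assuming automatic1111 was started with the --api flag on its default local port (the prompt fragments are just placeholders):

import base64
import itertools
import requests

# default local endpoint when the web UI is launched with --api
URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"

# placeholder prompt fragments to sweep over
subjects = ["woman standing", "woman kneeling"]
messes = ["slimed in pink slime", "covered in green gunge"]

for i, (subject, mess) in enumerate(itertools.product(subjects, messes)):
    payload = {
        "prompt": f"full length photo of {subject} nude, {mess}, white background",
        "steps": 20,
        "width": 1024,
        "height": 1024,
        "batch_size": 4,
    }
    r = requests.post(URL, json=payload, timeout=600)
    r.raise_for_status()
    # the API returns generated images as base64-encoded PNGs
    for j, img in enumerate(r.json()["images"]):
        with open(f"gen_{i:03d}_{j}.png", "wb") as f:
            f.write(base64.b64decode(img))

From there it's just a matter of flicking through the output folder and keeping the good ones.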
The second big thing I'm doing is using LoRA models on top of the base SDXL model. LoRA models are "add-on" models that you train on a small set of images plus descriptions, then apply via the prompt to add a particular style / concept / object; here is a guide:
https://aituts.com/stable-diffusion-lora/
When you apply a LoRA you choose how much weight to apply it with, and this interacts with the prompt. For example, if a LoRA is "about" gunge tanks, then applied with low weight it probably does nothing unless the prompt was already going to lead to a picture of a gunge tank ... but if you crank up the weight, it will eventually force whatever you were generating to turn into a gunge tank regardless of the prompt.
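In automatic1111 the weight goes directly into the prompt text with the <lora:name:weight> syntax. A quick illustration (the LoRA name "gunge_tank" is invented for the example; use your LoRA's actual filename):

full length photo of woman standing nude, white background, <lora:gunge_tank:0.3>
full length photo of woman standing nude, white background, <lora:gunge_tank:1.2>

The first will mostly leave the image alone; the second will push hard towards a gunge tank whatever else the prompt says.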
Currently I have 10+ LoRAs that I mix and match, and I expect to have more. This is where the real headache starts: it's no longer just about the prompt but also about the mix of LoRAs and their weights, so you might want to generate a few thousand extra images exploring the different variations. Which is what I've been doing today, trying to come up with the perfect image for one particular moment in the fourth game. A sketch of what that exploration looks like is below.
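Concretely, that exploration is just the earlier API loop with the LoRA weights swept over a grid, something like this (the LoRA names are again invented for illustration):

import itertools

loras = ["gunge_tank", "pink_slime_style"]  # hypothetical LoRA names
weights = [0.0, 0.4, 0.8, 1.2]

for w1, w2 in itertools.product(weights, repeat=2):
    prompt = ("full length photo of woman standing nude, white background, "
              f"<lora:{loras[0]}:{w1}> <lora:{loras[1]}:{w2}>")
    # feed `prompt` into the txt2img payload from the script above

That's already 16 variants for two LoRAs and four weights, and it multiplies up quickly as you add more; hence the thousands of images.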
It's certainly the case that AI is giving us a new way of generating content, but so far it does not seem to be remarkably low-effort for the results produced.
Certainly Stable Diffusion with custom LoRAs is the way to go for NSFW content. Here are a few quick generations using a similar prompt with pink slime. These were a few picks from about 80 images, of which probably around 60 are usable. If you get the LoRA trained well, it shouldn't produce a high proportion of waste.
Of course, one point where AI has a huge advantage over the real world is the cost of backgrounds, props and cleanup; see attached for what would be a painful cleanup.
That only fully applies to one-off images, though. So far the messy-ai.org sets are mostly prop-free with white backgrounds, because it's significant additional work to keep any objects consistent across all images in a series, making AI props ... ironically ... "expensive" in terms of compute and human time.
Were you able to find LoRAs trained for WAM? Or did you train them yourself? I'm also working with an AMD card (6900 XT) and am struggling to train LoRAs with it. Most instructions are geared towards Nvidia cards, and the handful I've found for AMD seem to require Linux. I managed to get Linux running in a VM, but I'm running into a lot of issues there since I'm not really a Linux user.
179jfkla said: Were you able to find LoRAs trained for WAM? Or did you train them yourself? I'm also working with an AMD card (6900 XT) and am struggling to train LoRAs with it. Most instructions are geared towards Nvidia cards, and the handful I've found for AMD seem to require Linux. I managed to get Linux running in a VM, but I'm running into a lot of issues there since I'm not really a Linux user.
Yeah, if you want to do Stable Diffusion properly on AMD you will need a native Linux install (not in a VM). Running in Windows you will get about a quarter of the generation performance on AMD, and training a LoRA will not go well. In Linux it will all work OK, but training still involves some extra hurdles. I had a 7900 XTX for a while and eventually sold it for an RTX 4080, and everything now works great in Windows without issue.
It's a shame, as ROCm and AMD's larger VRAM sizes should make their cards competitive, but most AI tools just don't support them particularly well, and after more than a year the ROCm-for-Windows implementation still hasn't shipped all the components needed for a proper SD experience.
For now though I'm using civitai to train; it's one of their paid features: 50 cents per LoRA. You upload a zip file with images plus text files with descriptions, it trains for about an hour, then presents ten versions of the LoRA at different training levels to download. They encourage you to publish the LoRA but it's not required; I didn't publish anything.
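If it helps anyone: the zip layout I used was one .txt caption per image with a matching filename, roughly like this (filenames and captions invented for illustration; check civitai's trainer docs for what it currently expects):

training_set.zip
  img_001.png
  img_001.txt -> "photo of a woman covered in pink slime, studio background"
  img_002.png
  img_002.txt -> "photo of a woman in a gunge tank, green gunge"
  ...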
Re: performance, the SDXL Turbo range of models made a huge difference: they stabilize much more quickly, i.e. need fewer iterations. Before the Turbo models I was mostly generating with 20 iterations; the base SDXL Turbo model works fine with 1, and the "DreamShaper XL" model that I'm currently using seems to work best with 4 iterations, which works out to about 4.5 seconds for a full image generation on my setup. This works well for exploratory generation, or I can do a batch run over some hours to generate many thousands of images, giving a high chance of finding what I want.
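In API terms the only change from the earlier script is the step count, plus, as far as I understand Turbo-style models, a low CFG scale; treat the exact numbers as assumptions and check the model card:

payload = {
    "prompt": "full length photo of woman standing nude, slimed in pink slime, white background",
    "steps": 4,      # Turbo models converge in very few steps
    "cfg_scale": 2,  # Turbo models generally want a low CFG; check the model card
    "width": 1024,
    "height": 1024,
}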
I assume you're on Linux for the actual Stable Diffusion part. The LoRA guide looks OK, but it is only a guide for Linux; Kohya_ss hasn't made it to Windows for AMD ROCm yet.
I'd already been training locally before seeing this thread. Let me know if you have any questions about training locally. I finetuned the full SDXL base model locally using Kohya with full fp16 training to fit on my 4090, and then extracted the LoRA from the final model.
While I personally train on Linux because it's just better for training machine learning models, Kohya definitely does work on Windows and I have used it as such:
If you do not own your own 3090 or 4090, that guide will explain how to provision cloud hardware that you can use to train a model. Do note, though, that 24GB is sufficient when using full fp16 training, so you don't necessarily need to spring for a 48GB card.
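For the extraction step, kohya's sd-scripts repo includes a script that diffs a finetuned model against its base. From memory the invocation is roughly: python networks/extract_lora_from_models.py --model_org sd_xl_base_1.0.safetensors --model_tuned my_finetune.safetensors --save_to extracted_lora.safetensors --dim 64. The filenames here are placeholders and the flag names may have changed between versions, so double-check against the repo.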