I have a business colleague who is very interested in AI and who pointed me at one of the current front runners. While clearly not there yet, some of these are pretty damn good for a bunch of Python code and the existing Internet to refer to. In particular, the woman in denim dungarees in the mud looks almost completely realistic.
vanpigboy said: Looking good, what AI did you use for those ones?
Stable Diffusion.
And through it I can now display one of Saturation Hall's greatest treasures, the two-hundred-year-old oil painting "Woman in a Boilersuit" by Constable, painted in 1822, just after his famous "The Hay Wain".
Plus a snapshot from a future fantasy mud event in the gardens.
Did a bunch more tests, asking for rear views to avoid the bad-faces problem. Definitely not quite there; these aren't going to be replacing modelled images yet, with the emphasis on the yet. Given that this technology didn't even exist a few years ago, you have to wonder how far and how fast it will develop from here.
A few notes on the experiments:
1. It's quite good at drawing women in boilersuits, less so at some other styles.
2. It doesn't seem to properly understand terms like "waist deep".
3. While it is quite good at water and mud, anything else like "bath of beans" seems to baffle it.
4. These notes are my experience of it; others may get very different results.
5. Sometimes you get better results if you ask for "in the style of" and specify a classical artist like Donatello, rather than just asking for a photograph (see the sketch below).
It's absolutely fascinating to play with in general; I've only posted WAM images here, but I've had some amazing results with various other prompts.
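For anyone who wants to try this at home, something like the following will run Stable Diffusion from Python via the Hugging Face diffusers library. Treat it as a minimal sketch rather than my exact setup: the model checkpoint, prompt and settings are just illustrative placeholders, and you'll need a CUDA-capable GPU.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a public Stable Diffusion checkpoint (needs a few GB of VRAM).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# "In the style of" phrasing from note 5 goes straight into the text prompt.
prompt = (
    "woman in a boilersuit standing in a muddy pool, rear view, "
    "oil painting in the style of John Constable"
)
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("boilersuit_constable.png")
```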
Plonk said: Sheeeeit, that's interesting. Who'd have thought one of the first victims of the AI revolution would be the niche wam producer?
To be honest, I don't think these are a threat to actual human-modelled WAM yet. For one thing, they're only still images when the market wants full-motion video, and there are still too many details that just aren't quite right when you look at the full-size images. Not to mention the total nightmare fuel it still tends to make of human faces, which is why all of these are back views. It's definitely a fascinating field of research though, and in ten years' time, who knows what will be possible?
Here are some images from my work with AI. I used Midjourney. All of the pictures were created from a text prompt alone, not from a base-image URL.
I really love the floaty, ethereal mud splashes it draws in these: unnatural, but stylistically relevant. It's like a frame from a dream, which I guess says a lot for the AI and something significant about our own brains.
Been experimenting some more with the latest release, it's a major step forwards.
Still far from perfect, and the real skill is in "prompt engineering", but as before, impressive results. Weirdly, I'd been trying for days to get it to draw people "waist deep" in stuff, which it totally ignored. Then I asked it for people lying down and got them standing waist-deep in mud pools.
It's still early days. I feel we're at the equivalent of the dawn of cinema, when a few far-sighted people were experimenting with basic, jerky, but just about understandable moving images. Except that this time round, the development from there to Avatar will happen in under ten years, rather than the hundred it took the first time.
There's a notable cultural bias: in each case I asked for images of either "young women" or "female engineers" (and then specified outfit, mud, etc.), and in every case it drew white people.
I also found it interesting to play around with this technology. I started with DALL-E 2. While it is very good at generating hair texture and clothing, it struggles a lot with faces, especially if you add your special WAM ingredients to the mix; without those, it can also create realistic faces.
What was already noticeable is that a lot of phrases and words are not allowed because they are not in line with the content policy. This will get worse, I assume, because those systems will become as good at identifying adult or fetish requests as they are at generating images.
I tried Midjourney as well, and it excels at generating faces and texture.
It looks like their approach is a bit different from the others. Not that I've done any research into how their AI works, but it looks to me as if they take a more CGI-based approach: their images look more like they were generated in a computer game. They look a lot like artificial movie/game characters, whereas other AI engines seem to stitch and blend details of real-world images together somehow.
Been a long-time lurker on here with a nondescript account name and little to say, but I thought I should unlurk for this topic as I might have something useful to add. I've been doing a lot of experimentation with v4 of Midjourney since it came out. I'm still on a learning curve with prompt design and with the kinds of source images that can improve the quality of the results, but I'm starting to get some promising output.
As those of you who have used it will probably know, it's rather drastically limited by NSFW moderation, particularly for source images. The tricky part for WAM stuff is that it will refuse source images with messy faces, probably because it mistakes them for something else. This is frustrating, as it removes one of the best methods for improving images: feeding them back into the network to bias them in a particular direction with a second image. At the moment Midjourney will often refuse its own outputs as potentially NSFW.
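For reference, the feed-an-image-back trick is what Stable Diffusion calls img2img, and because it runs locally there is no moderation filter refusing your own outputs. Here's a rough sketch using the Hugging Face diffusers library (Midjourney itself has no public API to script against, and the file names, prompt and strength value below are placeholders, not a tested recipe):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# Feed a previous render back in to bias the next result towards it.
init_image = Image.open("previous_output.png").convert("RGB")

# strength: near 0.0 stays close to the source image, near 1.0 follows the prompt.
result = pipe(
    prompt="woman covered in green slime, slimed hair, character design",
    image=init_image,
    strength=0.45,
).images[0]
result.save("refined_output.png")
```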
It's also a lot easier to make a load of mess and an attractive woman appear in the same image than it is to get them to interact. So far I have found better results by accepting a slightly lower level of photorealism in exchange for more natural coverage and pose. The prompt "character design" is a useful starting point for this.
When including source images, I do my best to match the features in them that I want to keep to the description in the prompt. Redundancy is also your friend: adding multiple near-synonyms, e.g. (slime, gunk, goo), plus descriptors like "slimed hair", seems to be worth it.
Doing cycles of upscaling to high detail and then back to the "light" upscale can be useful. The detailed upscale often improves the look of the mess, but often to the detriment of the faces, so comping images together in something like Photoshop or Pixlr (a free, web-based tool) helps.
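If you'd rather not hand-paint the comp, the same face swap can be roughed out in a few lines of Python with Pillow: paste the face region from the light upscale over the detailed upscale through a blurred mask. The file names and face coordinates below are placeholders you'd read off your own images.

```python
from PIL import Image, ImageDraw, ImageFilter

# The detailed upscale has the better mess; the light upscale has the better face.
detailed = Image.open("detailed_upscale.png").convert("RGB")
light = Image.open("light_upscale.png").convert("RGB").resize(detailed.size)

# Rough oval over the face region, blurred so the paste blends in.
mask = Image.new("L", detailed.size, 0)
ImageDraw.Draw(mask).ellipse((420, 110, 620, 360), fill=255)
mask = mask.filter(ImageFilter.GaussianBlur(25))

# Where the mask is white, take the light upscale; elsewhere keep the detailed one.
comp = Image.composite(light, detailed, mask)
comp.save("comped.png")
```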
Here are some of my recent attempts. Definitely not perfect, but they show where things might eventually lead. I'm looking to get into Stable Diffusion soon, but GPU access is a problem at the moment.