Hello all. One of my favorite parts of WAM is the slow and continual buildup of mess in a scene, and that is something AI struggles to achieve. I like to use AI to visualize how a scene might look before I shoot it in real life, or to generate an image of something improbable to see in a scene. I am also training to be a scientist, so I decided to perform an experiment to see what affects continuity between prompts and what can help reinforce it.
Abstract
For this experiment, I opted to use a single substance and a single outfit, recolored for the different models, in order to keep things relatively simple. The AI model is ChatGPT because I have a free trial month of Plus, so the results should be easily replicable by anyone for free. The personality is set to the default, and I started in a fresh chat. There is no hypothesis because I was just kinda fuckin around and seeing what stuck to the wall.
Results & discussion
Some things I found to help build continuity:
Prompting the AI to generate a picture of the subject posing after the mess seems to reinforce the mess into the prompt. For example, generating an image of frosting being sprayed on the subject, then generating an image of the subject posing, seems to bake the mess into the prompt for the next time the subject is in an image. The mess even stays in similar patterns on the same parts of the dress.
The shorter the prompts, the better. Instead of being specific about how the subject should be messier, just saying "more" seems to do that while keeping the original pose.
Problems I came across:
As the chat got longer, the continuity started to slowly shift. This is inevitable when playing a long game of telephone.
As the chat got longer, the AI had trouble keeping track of the mess and would start to "forget" things. This became more of an issue when the blonde subject entered; I had to hand-hold the AI and make the prompts more detailed.
Keeping a subject in a similar pose can bake that pose into the prompt, which takes effort to reverse.
Conclusion
This went a lot better than expected! Just see for yourself! I want to continue this chat with more frosting sprayed on the subject with curly hair, ending with them spraying champagne on each other, to see if I can get the AI to indirectly make their makeup run, because it gets iffy when I prompt that directly. Obviously this doesn't look as good as the real thing, but I don't have the money to buy my partner a dress of this style with the gloves, nor would we feel comfortable having someone else get messy with us, so for now this stays on the screen. Don't hesitate to ask me questions!
When I roleplay/write interactive stories with ChatGPT or Gemini, I like to pause after a detailed description of an outfit, character, activity, scene, etc. and ask it to create a photo of it. Doing it my way, I can usually only get a handful of continuity images before it loses track of certain elements or just creates a new image of the character, losing all progress. Then I have to delete the last couple of prompts and responses to get it back on track, and by that time, I've lost interest in the characters/story or have to stop due to real-life commitments (bah... work... lol).
It's pretty impressive that you were able to keep it on track long enough to do this. A lot of times for me, just adding a new character will make it decide to start from scratch the next time I ask for an image.
Also, I really like your choice of models and the outfit. Admittedly, I kinda lost interest once the faces started to drift and morph/distort, but that's just because faces are my primary focus in images. But I really like the redhead/dark red velvet combo, and the brunette in dark blue was great too. Well, all of them were great! lol But the faces were already changing before the blonde was introduced, and were pretty much gone by the time the black-haired woman joined.
One thing that I do to help maintain continuity is regularly reminding the AI of important details (I also give it an explicit general instruction to remind itself through excessive descriptors, which are far beyond overkill for any type of writing for a human audience), and occasionally re-uploading the initial character image so it keeps the faces from drifting too much. This can sometimes reset the mess though, so you might also need to re-upload the most recent messy image and tell it to use the face/outfit/character/etc. from the initial image and the mess from the second image.
To be honest, though... Once it drifts or derails, the best option is to just reset the conversation back to the last good image.
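If it helps to picture the "reset to the last good image" habit, here is a minimal bookkeeping sketch. It does not talk to ChatGPT or Gemini at all; `Turn`, `Chat`, `image_id`, and `rollback_to_last_good` are hypothetical names made up for illustration, and "good" is just your own judgment of whether the faces and mess still match.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    prompt: str
    image_id: str       # hypothetical identifier for the generated image
    good: bool = True   # set to False once faces or mess have drifted


@dataclass
class Chat:
    turns: List[Turn] = field(default_factory=list)

    def add(self, prompt: str, image_id: str, good: bool = True) -> None:
        self.turns.append(Turn(prompt, image_id, good))

    def rollback_to_last_good(self) -> Optional[Turn]:
        """Drop trailing drifted turns; return the newest remaining turn, if any."""
        while self.turns and not self.turns[-1].good:
            self.turns.pop()
        return self.turns[-1] if self.turns else None


if __name__ == "__main__":
    chat = Chat()
    chat.add("clean subject posing", "img1")
    chat.add("frosting sprayed on dress", "img2")
    chat.add("second subject enters", "img3", good=False)  # faces drifted here
    print(chat.rollback_to_last_good())
```

In a real chat the equivalent is deleting or editing the drifted prompts and continuing from the last image you were happy with.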
It's interesting because I barely described the mess when asking it to generate an image. I started with a fresh image of a clean woman because that will always get you the best-looking image, and I just kept asking for the subject to do x or y. I found that the more detailed I got, the more trouble the AI had with keeping it within regulations, so the best thing was to keep it vague. I never asked it to describe something in text because I was sure that would throw off whatever internal prompting mechanism was going on. You are most certainly right about the faces and makeup shifting, and that is probably what I will focus on next. The makeup shifting throws me off and I wish it would stay the same, but I think there is a problem with doing that and having mess on the face at the same time. It's funny you specifically point out the color of the subject's dress, because at first that is the one thing I did not specify. I just said a redheaded woman in a strapless velvet minidress with gloves and stockings; the AI chose to go with a red dress. I also think the fashion shoot is a good setting because the background is very plain, letting the AI put all the processing into the subjects.
The problem you will always face using Gemini or GPT conversations is the lack of control. Sometimes it'll work, sometimes it won't, but you are at the whim of the underlying instructions. In GPT you can improve things by creating style and character anchor images that are always referenced, plus custom instructions written specifically for sequence generation.
The ideal way of achieving sequences is to use the API with a custom instruction set, thought signatures, anchor images, and a feedback mechanism that uses the previously generated image as the next input.
The other big thing to remember is not to give the model free rein. You want to describe only the change delta from the previous image.
As an example, I used NanobananaPro to create this sequence.
Obviously this is an ideal approach, but that doesn't mean you can't get close to it in GPT or Gemini; you just need to leverage what's available.
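For anyone wondering what that loop looks like in practice, here is a rough sketch under stated assumptions, not the exact setup used for the sequence. The `generate` callable is a hypothetical stand-in for whatever image API you use, assumed to take a prompt plus a list of reference images and return the new image as bytes; `SequenceRunner`, `step`, and `fake_generate` are made-up names. Thought signatures are left out because they depend on the specific API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical backend signature: (prompt, reference_images) -> new image bytes.
# Swap in a real image API call here; nothing below depends on a specific provider.
GenerateFn = Callable[[str, List[bytes]], bytes]


@dataclass
class SequenceRunner:
    generate: GenerateFn
    instructions: str        # fixed instruction set sent with every step
    anchors: List[bytes]     # character/style anchor images, never changed
    frames: List[bytes] = field(default_factory=list)

    def step(self, delta: str) -> bytes:
        """Generate the next frame from the anchors, the previous frame, and a delta."""
        refs = list(self.anchors)
        if self.frames:
            refs.append(self.frames[-1])  # feedback: previous output is the next input
        prompt = (
            f"{self.instructions}\n"
            f"Keep everything from the previous image unchanged except: {delta}"
        )
        frame = self.generate(prompt, refs)
        self.frames.append(frame)
        return frame


def fake_generate(prompt: str, refs: List[bytes]) -> bytes:
    """Stand-in backend so the sketch runs without any API key."""
    return f"frame built from {len(refs)} reference(s)".encode()


if __name__ == "__main__":
    runner = SequenceRunner(
        fake_generate, "Same subject, outfit, and lighting.", [b"anchor"]
    )
    print(runner.step("frosting sprayed on the left glove"))
    print(runner.step("more"))
```

The point of the design is that each step only carries the delta text, while identity comes from the anchors and the accumulated state comes from the previous frame.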
Wow, great pics.
The edge of the goo doesn't look that real, but the way the girl looks, her legs and toes, that shine on her pantyhose, and the background... well done. More of her and her messy legs, please.