It's still pretty hard to make AI get the concept of a thrown pie ... but it's getting somewhat usable now. The physics of the cream impact and its effect on hair and surroundings are already pretty good. I am looking forward to the next iterations of tools and models.
Oh, and because I get this question a lot: there isn't one tool or model I use to generate these. It's a combination of tools, and every sequence featured in a video takes several steps to generate. My work also builds on archives of my own work and generative AI material that has been refined over several iterations of the underlying tools. I assume most of the current AI video creation tools can produce decent results, but they either require a lot of computing power if run locally or you need to use credits. Either way, producing this costs money. I haven't yet had a chance to use Veo3, as it's not available on my side of the big pond, but I am looking forward to testing it.
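To give a sense of the shape of it, the chain looks roughly like this pseudo-pipeline (every function below is a hypothetical placeholder standing in for a tool, not any real product's API):

```python
# Hypothetical skeleton of the multi-step chain; each function is a
# placeholder for a tool in the workflow, not a real API.
def refine_prompt(idea: str) -> str:
    # stand-in for an LLM pass that expands the idea into a full scene prompt
    return f"cinematic shot, {idea}, cream pie mid-flight, detailed splatter"

def render_still(prompt: str) -> str:
    # stand-in for an image model (local or hosted); returns a file path
    return "still_001.png"

def image_to_video(still: str, prompt: str) -> str:
    # stand-in for an img2vid service; each clip may take several attempts
    return "clip_001.mp4"

def upscale_and_refine(clip: str) -> str:
    # stand-in for a local GPU pass before the final cut
    return "clip_001_upscaled.mp4"

prompt = refine_prompt("pie ambush at a garden party")
final = upscale_and_refine(image_to_video(render_still(prompt), prompt))
print(final)
```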
Outstanding results. I know how much effort goes into preparing the datasets, training the models, and refining them. It's a thankless task much of the time. I'm curious about your workflow. Are you creating start/end frames in Flux using your LoRAs and then using Kling etc. to generate the videos before upscaling, or have you gone straight to training Wan2.1 and using txt2vid?
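For concreteness, the keyframe step of the first option might look roughly like this with diffusers (a sketch only: the LoRA path and prompts are placeholders, not your actual setup):

```python
# Sketch: start/end keyframes with Flux + a character LoRA via diffusers.
# The LoRA path and prompts are placeholders.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("path/to/character_lora.safetensors")  # hypothetical path

for name, prompt in [
    ("start", "portrait of the character, calm, a cream pie flying toward her"),
    ("end", "the same character, face covered in cream, hair dripping"),
]:
    image = pipe(prompt, num_inference_steps=28, guidance_scale=3.5).images[0]
    image.save(f"{name}_frame.png")  # feed these into an img2vid tool like Kling
```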
I've got my hands on a new RTX Pro 6000 (96GB), which I'm using locally to train HiDream. Their full FP16 model is outstanding and trains far better than Flux as far as my testing goes. I haven't got round to training Wan yet; that's on my list of things to try soon, but work keeps getting in the way.
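For a sense of why the 96GB matters, here's my rough back-of-envelope math, assuming HiDream-I1's published ~17B parameter count (my own estimates, not official figures):

```python
# Back-of-envelope VRAM math, assuming ~17B parameters for HiDream-I1.
params = 17e9
fp16_weights_gb = params * 2 / 1e9    # ~34 GB just to hold the frozen base model
adamw_full_ft_gb = params * 16 / 1e9  # ~272 GB for a full mixed-precision
                                      # fine-tune (fp16 weights+grads,
                                      # fp32 master weights + Adam moments)
print(f"base weights:   {fp16_weights_gb:.0f} GB")   # fits easily in 96 GB
print(f"full fine-tune: {adamw_full_ft_gb:.0f} GB")  # far beyond 96 GB
# which is presumably why adapter-style training on top of the frozen
# FP16 base is the practical route even on this card
```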
messg said: Outstanding results. I know how much effort goes into preparing the datasets, training the models, and refining them. ...
Thanks for the feedback. I think you are already much more sophisticated than I am. I put a lot of time into preparing my base material and still images; these are usually my starting point. I have been creating and refining characters for almost two years now, using locally trained models as well as various online services, even ones as basic as Bing or ChatGPT. I use ChatGPT to create scenarios and prompts and to help refine my scenes. For video I use Runway, Luma, or Kling, and Midjourney will also become an option. I haven't yet started to think about running video generation locally. I only have one fast GPU, and I mostly use it for upscaling and refining locally. Obviously Veo3, with the option of adding audio, will be of interest to me, but it's not available here yet and I don't want to go through the hassle of a VPN and a US Google account.
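For the local upscaling step, something in the spirit of diffusers' x4 upscaler gives the idea (illustrative only, not my exact pipeline; the file paths and prompt are placeholders):

```python
# Illustrative local upscale/refine pass with a diffusers upscaler.
# Paths and prompt are placeholders, not my actual pipeline.
import torch
from diffusers import StableDiffusionUpscalePipeline
from PIL import Image

pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

# keep the input small; the x4 upscaler is memory-hungry at large sizes
low_res = Image.open("video_frame.png").convert("RGB").resize((512, 512))
upscaled = pipe(prompt="cream pie impact, detailed hair", image=low_res).images[0]
upscaled.save("video_frame_x4.png")  # 4x the input resolution
```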
Very impressive. There are some fantastic subtleties, not just in the pie hits but in the way the models move and interact with the environment around them.