So I haven't completely fallen off the radar. I've been working diligently on a new V3 set for the new year, and I'm finally getting some results :D. Nothing to show yet, and so far it still looks significantly worse than V2, but I have brand new models to aim this at AND brand new tooling. I recently swapped Windows out for Linux, which killed off booru tag manager for me and forced me to learn Tag UI, and Python 3.12 is forcing me to switch over to Forge UI and One Trainer, too! Getting this to run at all was a big step!
New tooling aside, a year's worth of work has given me quite a solid base to use as a training set. While I'm still fiddling with what to include or exclude, the base set has 120 images with different shades of grey background. Because I control the base images from CSP, I was also able to save Dante directly as a transparent PNG, and I can use that to mask him for training purposes in One Trainer (which supports masked training).
One of the harder problems I've run into is tagging these properly. I get about 75 tokens per image to work with before One Trainer starts yelling at me. I'm lucky in that the modern WD taggers seem to tag the images really well, but I'm constantly chopping out tags to stay under that limit, and for furry characters, tags build up fast.
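For anyone fighting the same limit, here is a rough sketch of how the tag chopping could be automated. Treat it as an illustration only: `estimate_tokens` and `trim_tags` are names I made up, the token estimate is a crude stand-in (a real CLIP-style tokenizer splits text into subword pieces, so the true count can come out higher), and it assumes the tags are already sorted with the most important ones first.

```python
import re


def estimate_tokens(tag: str) -> int:
    """Crude estimate (an assumption, not a real tokenizer):
    one token per run of letters/digits, one per punctuation mark."""
    return len(re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", tag))


def trim_tags(tags, budget=75, count_tokens=estimate_tokens):
    """Keep tags in priority order while the estimated total fits the budget.

    Each tag after the first also pays one token for the comma that
    separates it from the previous tag. Tags that do not fit are skipped,
    so a shorter tag further down the list can still make it in.
    """
    kept = []
    used = 0
    for tag in tags:
        cost = count_tokens(tag) + (1 if kept else 0)
        if used + cost > budget:
            continue
        kept.append(tag)
        used += cost
    return kept, used


if __name__ == "__main__":
    # Hypothetical tag list, most important tags first.
    tags = ["dante", "fennec", "big ears", "grey background", "solo"]
    kept, used = trim_tags(tags, budget=75)
    print(", ".join(kept), f"(~{used} tokens)")
```

Passing a real tokenizer's counting function as `count_tokens` would make the numbers trustworthy. The default is only good for a ballpark figure.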
I'm still finding lots of contradictory information about how these data sets should be tagged. The two main camps of guidance are "Don't include anything associated with the character in the tags" and "include all the key aspects of the character you want the lora to change". To include fennec or not to include fennec in the keywords :/.
While I continue to hope for quality improvements, I think the best return I'm going to get from this generation of models is prompt coherence and flexibility. The newer models are really good at following directions (more so than older ones) and can even let me describe multiple characters in the same image. This could lead to a TON of new fun images of Dante. I've always faced limitations in guiding the images for Dante in the past and had to resort to different tricks to get the results I wanted. Maybe there is a chance to get a wider variety of pictures from these new methods without sacrificing quality (or where I can at least use two generations of models to perfect the quality on Dante in a second pass)?
Overall, the new data set currently has about 120 images, some monochromatic and others focusing in on key parts of Dante. Every single one was created from my set of AI-Assisted art from the past, meaning that my edits and changes are the only place where the AI entered the creation of this data set. Many of the early AI images weren't even up to standard to make the cut, and the result is a very homogeneous data set in terms of style and character features.
I have a lot of work to do, but here's hoping that V3 results in some more great adventures for my fuzz fuzz that everyone can enjoy!
Squeak!
- Dante
Added: 23 Feb 2025 07:27 CET