InkBunny's administration posted an update on their approach to AI-generated and AI-assisted content. I give them full credit for being smarter than the admins of some other websites, who lack any awareness of reality; however, InkBunny's approach reflects a lack of understanding and foresight too. Let me explain the hundred ways the "80% of InkBunny's rules being about AI" approach can and will fail.
The hell is going on?
In case you're living on Pluto, here's a recap:

1. Big corporations created neural networks which can generate photos and art based on text queries: DALL-E, DALL-E 2, Midjourney, eDiff-I etc. They have crazy hardware requirements, free generations are limited, access is paid, and the implementation is closed.
2. The open-source competition created Stable Diffusion (SD). The code is open, the models are public, anyone can train models, anyone can generate art, hardware requirements are low, and everything is free. However, the quality is lower than that of the state-of-the-art closed models.
3. Everyone and their grandma started training specialized SD models for anime, furries etc. Both open and closed models emerged.
4. Everyone and their grandma started posting generated art.
5. Everyone and their grandma started raging about stolen art, inappropriate use, the death of art and the upcoming apocalypse.
6. Websites started thinking about how to deal with the generated art.
7. (You are here.)
8. We welcome our new computer overlords.
Wrong assumptions
Most people don't seem to understand the process of generating and modifying art, how widespread the use of tools involving machine learning already is, and what this means for the future of art. Given wrong assumptions, you come to wrong conclusions. Let me explain just a few of the misunderstood points.
In the event that you used an AI tool to assist in the creation of assets or backgrounds for an otherwise manually-created piece of artwork
Wrong assumption: the artist knows which tools use machine learning and to what extent. Wrong assumption: the artist is aware of how the background was created. Wrong assumption: the artist wants to disclose this information.
It isn't uncommon to use backgrounds created by others and released under some sort of Creative Commons license. Creators on other platforms aren't forced to disclose information about the tools they use, so the artist has no way of knowing whether a background was AI-assisted and to what extent. You can't disclose information you don't have.
Let's assume the artist created the background themselves in Photoshop. There's a high chance some of the following tools were used: select subject, select focus area, select sky, magnetic lasso, resize/transform image/selection, content-aware fill, one-click delete and fill etc. There's no "WARNING: POWERED BY AI" message on any of these dialogs. And of course, there are neural filters like noise removal, JPEG artifact removal etc. which were previously algorithmic and now use neural models. In a year, they'll be moved from the experimental neural filters list into the main menu, and almost nobody will notice. Would the artist need to disclose usage of any of these tools? Good luck explaining to anyone the difference between median, smart and neural.
And finally, if the artist doesn't want to disclose this information, how do you enforce the rule? There's absolutely no way. This rule is unenforceable by nature.
In the event that you used a tool like img2img to take an input image you created and produce an AI assisted output
Wrong assumption: there's only one sketch. Wrong assumption: the sketch is easy to extract.
Whoever wrote this rule saw how neural models create images from MS Paint-style sketches. However, "img2img" is a much more complex tool. While creating an image, the artist can use inpainting models to fix various parts of it: for example, by drawing on top of the image with rough brush strokes, masking the roughly drawn part and telling the inpainting model to improve it. Or by erasing a part of the image. Or by using outpainting to extend the existing image. Or by using any of these steps a hundred times. In case anyone wonders, there are plugins for Krita and Photoshop which enable this kind of workflow, but it can all be done manually too, of course.
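To make it concrete, a single one of those "fix this region" steps boils down to something like the following. This is a minimal sketch assuming the open-source diffusers library; the checkpoint, file names and prompt are placeholders, and the plugins mentioned above just wrap this loop in a GUI.

```python
# A minimal sketch of ONE inpainting step (diffusers assumed; checkpoint,
# file names and prompt are placeholders). A real session repeats this
# dozens or hundreds of times with different masks, prompts and models.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # any inpainting checkpoint
    torch_dtype=torch.float16,
).to("cuda")

canvas = Image.open("work_in_progress.png").convert("RGB")  # current image
mask = Image.open("rough_strokes_mask.png").convert("L")    # white = repaint

# Regenerate only the masked region; everything else is kept as-is.
canvas = pipe(
    prompt="detailed paw, soft fur",  # placeholder prompt for this region
    image=canvas,
    mask_image=mask,
).images[0]
canvas.save("work_in_progress.png")  # and the cycle repeats
```

Now multiply that by a hundred iterations across several models and services.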
What is the artist supposed to upload then? Every AI-improved brush stroke? What if that requires more than whatever the current images-per-submission limit is (50?)? What is the workflow? Is the artist expected to save each step as a separate image?
The image description must contain all the prompts and seeds passed to the generator
Wrong assumption: this data is enough to reconstruct the image. Wrong assumption: this data is possible to provide.
I assume the idea here is to make reproducing the image possible, thus proving that the artist didn't lie about the inputs. This couldn't be further from the truth. To reproduce the image, you also need: the exact model hash (versions can vary a lot), the exact CFG scale, the exact step count, the exact pixel-perfect mask (in the case of inpainting), and various other parameters depending on the sampler used. All this is especially important in the case of ancestral samplers, for which a one-step difference can mean a completely different image.
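For illustration, here's a minimal sketch of everything that has to match exactly to reproduce a single txt2img result. The diffusers API is assumed, and all concrete values are placeholders:

```python
# Everything below must match EXACTLY to reproduce one txt2img image
# (diffusers assumed; all concrete values are placeholders).
import torch
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"  # the exact model version/hash, not just a name
).to("cuda")
# The sampler matters too: ancestral samplers inject fresh noise at every
# step, so even the step count completely changes the result.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

image = pipe(
    prompt="a lion on a rock at sunset",    # prompt
    negative_prompt="blurry, low quality",  # negative prompt
    num_inference_steps=28,                 # exact step count
    guidance_scale=7.5,                     # exact CFG scale
    width=512, height=512,                  # exact resolution
    generator=torch.Generator("cuda").manual_seed(1234567890),  # exact seed
).images[0]
```

A prompt and a seed alone, as the rule demands, cover maybe a third of these knobs.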
Furthermore, the currently evolving tools just don't provide this sort of data. During the process of drawing, the artist can generate hundreds of images with randomly varying prompts across a variety of models from a variety of services. In such a case this requirement would be absolutely impractical, and even when followed to the letter, it is very unlikely to provide enough data to reproduce the image, thus nullifying the sole reason for providing these details.
The image description must indicate what training data was used (if known)
Wrong assumption: there's a sensible way of getting this information.
State-of-the-art models use thousands of datasets consisting of millions of images. State-of-the-art anime and furry models rely on hundreds of automatic and manual datasets. There's no way to monitor the training of even the public models, because their owners train them on private services. So even if the owners of the models provide some data about the sources of the datasets, it is in no way verifiable. And you can't expect every artist to monitor the datasets of every model they use. This rule is literally impossible to follow.
The image must not have been generated using prompts that include the name of a living or recently deceased (within the last 25 years) artist
Wrong assumption: lack of the artist name in the text prompt can be verified. Wrong assumption: not using an artist's name means not using data from their art. Wrong assumption: this rule limits the styles to those not used by popular artists. Wrong assumption: style can be copyrighted.
There's no way to verify the provided inputs given the current rules. And if the rules are extended to require the full data, it's very easy to make them impossible to follow just by relying on a complex workflow. (See above about the plugins for Photoshop and Krita.)
The artist's art in the dataset is used no matter what text prompt you provide. It's just a fact. If you have a neural model trained on an artist's art, there's absolutely no way to make the model "forget" that artist.
What does this rule mean in practical terms? If the artist uses a generic prompt instead of entering some name, cherry-picking will just require more images to produce something they want. It may also be possible to just use hypernetworks, text embeddings, dreambooths etc. which lean toward a specific style, thus completely bypassing this restriction.
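Here's a sketch of how little effort that bypass takes, assuming the diffusers API; the embedding file and its trigger token are hypothetical:

```python
# Sketch: complying with the "no artist names" rule while keeping the style.
# The embedding file and trigger token are hypothetical (diffusers assumed).
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to("cuda")

# A textual-inversion embedding trained (by whoever, on whatever art)
# to reproduce a specific artist's style under a meaningless trigger word.
pipe.load_textual_inversion("superawesome-style.pt", token="superawesome")

# No artist name anywhere in the prompt, yet the style is baked in.
image = pipe("superawesome style, anthro wolf portrait").images[0]
```

The submission now formally complies with the rule while doing exactly what the rule was meant to prevent.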
Furthermore, this rule operates on the assumption that a style is a copyrightable entity. As far as I know, this is not the case in most countries of the world (correct me if I'm wrong).
Overall, the rule doesn't achieve anything, it's trivial to circumvent, and it relies on an incorrect understanding of the law.
you must not upload content for which you used closed-source tools or those which charge a subscription fee to access a gated model
Wrong assumption: artists are aware of the ML nature of their tools. Wrong assumption: closed-source models treat copyrighted material differently (?). Wrong assumption: this rule restricts use only of NovelAI, Midjourney and the like.
Frankly, I just don't understand the reason for differentiating between open-source and closed-source models. They're produced in similar ways, they generate similar content, they ignore copyright in the same way etc. The only differences are price and quality. To me, this sounds like, "You are allowed to use GIMP, but using Photoshop is forbidden".
NovelAI and Midjourney have a distinct style which may be hard to reproduce with SD, thus hiding their usage may be problematic, but this may only be temporary. In half a year, SD will be able to generate what NovelAI models generate now. So, considering how hard it is to reproduce an image (see explanations above), you just won't be able to prove anything, making the rule unenforceable.
Another major issue with this rule is that a lot of ML tools which aren't exactly "artistic" are closed-source, notably all the Topaz tools. This rule disallows upscaling and frame interpolation using Topaz tools. This restriction seems random and unwarranted.
You may not upload more than six images with the same prompts in a single set
Wrong assumption: this restriction is beneficial for the site's visitors.
I can see the problem with people uploading 50 images with the same prompt, especially regarding bloating server storage. However, varying text prompts isn't that hard: "sitting", "standing", "lying", "close-up" etc. Frankly, I'd rather see this stuff within one submission than scattered across hundreds of submissions with tiny variations.
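To underline how low the bar is: generating formally distinct prompts is literally a one-liner. A sketch, with all prompt fragments made up:

```python
# Sketch: trivially producing "different" prompts to stay under the
# six-images-per-prompt limit (all prompt fragments are made up).
base = "anthro lion, detailed background, sunset"
poses = ["sitting", "standing", "lying", "close-up", "side view", "action pose"]

# Six formally different prompts, each good for its own set of six images.
prompts = [f"{base}, {pose}" for pose in poses]
```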
You must not sell fully AI-generated artwork, adopts or commissions
Wrong assumption: the word "fully" means anything.
Is art generated from an MS Paint drawing "fully AI-generated"? What about inpainting and outpainting? Do the artist's brush strokes need to be included? Overall, where's the line?
What do you propose instead?
No matter what rules you impose, they will be ignored, knowingly and unknowingly. The current rules may delay the influx of generated image spam, but trying to micromanage and impose detailed restrictions is doomed to fail. These rules can't be enforced.
I can suggest these simple rules:
1. If the image is primarily produced by a neural network with little to no editing, you MUST tag it with "ai_generated" and the model/tool name.
2. DO NOT upload more than N (20?) images per day.
3. If you used neural networks for editing in a substantial way, you SHOULD tag it with "ai_assisted" and the model/tool name.
4. The "ai_generated" and "ai_assisted" rules apply with the same modality to commission offers.
(The rule about characters remains, ML doesn't affect it.)
Everything else is impractical, unenforceable and unhealthy. People will whine. You won't like what happens.
Deal with it.
There's no way around it.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
All of this really flew past me so quickly that I didn't realize how far advanced things are right now. Though I don't think I've seen anything TLK-related of any quality yet where I know it's ai_generated. Maybe without knowing? I have no idea what can be done, really, even with the open (source) and/or free tools. All of this is rocket science to me. Or cryptocurrency. I know it's "money". Full stop.
I managed to get some fun results using "modern Disney" textual inversion on top of standard Stable Diffusion 1.5. It's SFW, but can produce really cute results. This is the only combination which allowed me to get something TLK-related which doesn't look like crap.
Furry Diffusion models produce somewhat nice looking but boring pinups. Ferals, fetishes, compositions, interaction and everything else ranges from hard to impossible. It may be possible to train dreambooths/hypernetworks/textual inversions/whatever for FD which improve the quality of TLK-related art, but I lack knowledge, hardware, money and time for this.
Zack's model contains fetishistic content, but it's low quality.
NovelAI released a new furry model, which seems promising, but you need to pay for it (not a lot, but still). I haven't seen TLK generations from it, but MLP and WoW look okay. (You can find examples on IB, but hurry.)
High-quality FD pinups rely on artist names, which is disallowed by the stupid new InkBunny rules. There are 50 ways to circumvent the restriction, it just requires some time.
I may post the "modern Disney" SFW images though. These are fine according to the new rules. Simple queries, just some cherry-picking. I guess I'll have to create, um... a 5th InkBunny account for ML art? I kinda lost count.
o_o I must say, I am kind of dumbfounded by now... nice... this looks. Maybe not as TLKish as I would like it to be, in the sense that I'd like it more cartoony and less CGI-style, but color me very impressed. These are all awfully cute. If a bit samey and simplistic to my eyes, they're still cute nonetheless.
And I wish I knew anything about how these are made. I mean really. All I understand is that there is a certain computing process behind it, and that that process has a rough outline of what shapes and what shades go together, and where. It's seen a paw; it knows it has toes, and where they'll be.
So, perhaps I now understand a little better why artists seem to be very enthusiastic and upset about this. The thing is, it's not going to go away. If anything, I think, in time it'll be so powerful that we'll enjoy entire VR worlds composed entirely through these methods, and it'll be yet another revolution in porn and pornographic freedom/excesses. And family-friendly stuff, too. We might get completely swamped with TV shows and games or even other kinds of entertainment mostly comprised of this. Maybe.
Human artistry might suffer a good bit. At least the commercial bit. And maybe that is unavoidable.
Where you see suffering, I see progress. Artists will just have to adapt to the new reality instead of learning once and then applying the same tools the same way over and over. Mind you, traditional artists working with real media and having a real skill won't even notice this. Or high-profile digital artists. It's the low-tier digital artists who ignore the new tools that will suffer the most, because their skill may become obsolete.
In the realm of art, I hope to see more interest in ideas, plot, emotions, story — things which are hard for computers (for now). When photography became a thing, producing an image of a realistic-looking forest stopped being a proof of skill. When ML becomes a thing, pretty pinups of cute waifus on cool backgrounds will stop being a proof of skill. Artists will have to do more.
Photography caused more artistic styles to emerge. How is ML different?
The point isn't that style is "copyrightable" (though in a very narrow sense, design patents exist). It's "trying to duplicate someone else's style using keywords likely to bias towards their own work specifically seems like kind of a douche move, and we don't think doing that will win AI art many friends here." It is not about what is legal, it is what is acceptable to the audience.
As I said in a comment elsewhere, the issue with proprietary models is that if you are, as you say, "ignoring copyright", you shouldn't be making money off that, and hiding your own stuff so others can't use it. Fair's fair. Other sites may have different perspectives.
For img2img, each step could be a separate image, but that seems excessive, and I'd suggest only noticeable changes. Submissions can have 104 images so there is plenty of room to show your working to the extent that seems appropriate.
I totally accept that it may not be possible to fully detail the parameters leading to a particular piece, but the intent is to get as close as possible. Specific suggestions as to how to adjust the current wording ("all prompts and seeds passed to AI tools and ... the generator, training model and version or hash used") to achieve that are welcome. Aside from reproduction and learning, the goal is also to allow people to block programs and models they do not want to see via keyword, or search for examples of their use.
It's "trying to duplicate someone else's style using keywords likely to bias towards their own work specifically seems like kind of a douche move, and we don't think doing that will win AI art many friends here." It is not about what is legal, it is what is acceptable to the audience.
Let's say someone trains a dreambooth/textual inversion/hypernetwork on top of SD with a heavy bias towards a specific style, using an undisclosed dataset of undisclosed images by undisclosed artists (like 99% of "open" "improvements"). The result would be largely the same as using the artist's name; it'll just have some random token like "superawesome style" attached to it instead.
Is this allowed?
"
Quote: As I said in a comment elsewhere, the issue with proprietary models is that if you are, as you say, "ignoring copyright", you shouldn't be making money off that, and hiding your own stuff so others can't use it. Fair's fair. Other sites may have different perspectives.
So your plan is actually to reduce the popularity of the closed-source competition by discouraging its usage, using the one and only capitalist way of voting with money? The criterion still seems arbitrary to me, but at least it makes logical sense now. Okay then.
But out of curiosity, with these principles in mind, what's the stance on paid services providing access to open models like SD? They take something produced by the community, thus relying on "stolen" art; they don't add anything of value and don't improve open or closed models; they just invest in hardware and get money out of thin air. To me, this seems "more evil" than NovelAI, yet using these services isn't discouraged by the rules.
"
Quote: For img2img, each step could be a separate image, but that seems excessive, and I'd suggest only noticeable changes. Submissions can have 104 images so there is plenty of room to show your working to the extent that seems appropriate.
What's the point of this rule in the first place? If it's reproducibility, then it's impossible. Off by one bit, and you can easily get a completely different image.
I think that at the current moment, the discussion will mostly rely on emotions rather than facts. This is just how human psychology works, and we should accept it rather than fight it if we want to move forward. The InkBunny admin team has to choose the rules that will primarily make the existing artists feel secure and not displaced by the new technology. There have been similar paradigm shifts in the past, such as the advent of photography, digital art, and even monetizing art (several years ago, commissions were seen as controversial in some circles). With time the emotions will subside, and the usage of ML algorithms will become normalized.
This is all way too complicated. I just want to use NovelAI to sample, vibe transfer, and combine things I like with my own sketches to make and edit the images for fun, not for profit. I primarily write stories and using AI to help me visualize those stories is fun for me. Having to learn and do all of this extra legwork just to "Show my work" is unrealistic and takes all the fun out of it.