Technologies to employ when Generating

If you are here, that means you want to learn some of my 'secrets'. I'll gladly share them with you.

Tools

There are a couple of good generators. Here are some of them:
* ComfyUI
+ This one has my own set of Workflows.
* Automatic1111
* InvokeAI

Models

Of course, all good generations start with a nice model. Here are a couple of my favorites!
* Fluffusion (usually: Revision 1, Epoch 12 or 20)
  Pros:
    Dataset consists of ~900000 images from e621.
    Most common meta-tags included
    Training set at 640x640 size or higher.
    All tags without underscores
    Clip Skip = 2
  Cons:
    Nothing really. Maybe the absence of offset noise.
* FluffyRock (usually: 1088 MegaRes, Epoch 24 with Offset Noise)
  Pros:
    Training set at 640x640 size and higher, up to 1088x1088.
    All tags without underscores
    Offset Noise gives much more vivid images
  Cons:
    Must prepend all prompts with 'e621' tag.
    Clip Skip = 1
    Dataset consists of ~415000 images from e621.
    Meta-tags not included = harder to achieve specific styles for academics
* Furtastic V2.0
  Pros:
    Dataset consists of ~700000 images from e621.
    Common meta-tags included
    Training set at 768x768 size or higher.
    All tags without underscores
  Cons:
    Clip Skip = 1
    Somewhat depends on negative embeddings?
    Problems with separating objects and artists (for example, in Fluffusion, 'trout' and 'by trout' are different items, since the first one is a fish and the second one is an artist)
* Yiffy Mix V2.2 and V3.1
  Pros:
    Most Stable Diffusion items are left intact... probably. So you can use tags like "Michael & Inessa Garmash". Somewhat.
    All tags without underscores.
    Creates somewhat nice images.
  Cons:
    Clip Skip = 1
    Long, convoluted prompts.
  Recipe:
    * Yiffy 2.2:
      1 step) Fluffusion E20 + Zeipher F111 - SD4 on DIFF mode 1.0 = S1
      2 step) S1 + Rev Animated v1.2.2 on SUM mode 0.4 = S2
      3 step) S2 + FluffyRock E16O1 on SUM mode 0.35 = S3
      4 step) S3 + CrossKemono2.5 on SUM mode 0.1 = YiffyMix 2.2
    * Yiffy 3.1:
      1 step) Fluffusion E20 + Zeipher F111 - SD4 on DIFF mode 1.0 = S1
      2 step) S1 + Rev Animated v1.2.2 on SUM mode 0.4 = S2
      3 step) S1 + DarkSushiMix-Darker on SUM mode 0.4 = S3
      4 step) S2 + S3 on SUM mode 1.0 = S4
      5 step) S1 + S4 on SUM mode 0.5 = S5
      6 step) S5 + FluffyRock-3M-e51o36 on SUM mode 0.35 = S6
      7 step) S6 + IndigoFurryMix_v35 Realistic - YiffyMix v2.2 on DIFF mode 0.35 = Yiffy Mix 3.1
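
For the curious, here is a minimal sketch of what the two merge modes in the recipes could look like. It assumes SUM and DIFF mean the same thing as the "Weighted sum" and "Add difference" modes of the Automatic1111 checkpoint merger, which this journal does not confirm. The one-weight "models" and their numbers are made up for illustration; real checkpoints hold tensors, not single floats. Note that under this reading a SUM at 1.0 would simply return the second model, so the tool used for the recipes may define its modes differently.

```python
from typing import Dict

# Toy stand-in for a checkpoint's state dict (real ones map names to tensors).
Weights = Dict[str, float]


def weighted_sum(a: Weights, b: Weights, m: float) -> Weights:
    """Weighted sum: A * (1 - M) + B * M, for keys present in both models."""
    return {k: a[k] * (1.0 - m) + b[k] * m for k in a if k in b}


def add_difference(a: Weights, b: Weights, c: Weights, m: float) -> Weights:
    """Add difference: A + (B - C) * M, for keys present in all three models."""
    return {k: a[k] + (b[k] - c[k]) * m for k in a if k in b and k in c}


# Hypothetical one-weight models; the values are invented for this example.
fluffusion = {"w": 1.0}
f111 = {"w": 5.0}
sd_base = {"w": 2.0}
rev_animated = {"w": 3.0}

s1 = add_difference(fluffusion, f111, sd_base, 1.0)  # shaped like "1 step)"
s2 = weighted_sum(s1, rev_animated, 0.4)             # shaped like "2 step)"
print(s1, s2)
```

Each later step in a recipe is just another call to one of these two functions, fed with the result of an earlier step.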

Generation Parameters: Sampler

There are a couple of samplers that work like a charm for creating pretty images. Those are:
* Euler Ancestral aka. Euler A
  Pretty unstable: it will usually generate completely different images at different step counts. Nice for one-shots, good for inpainting, bad for scaling. Gives nice results anywhere from 8 to 40 steps.
* DPM 2 Ancestral Karras aka. k_dpm_2_a
  Somewhat of a Starving Artist. If you give it 50-80 steps, it will probably create a super-detailed picture.
* DPM++ 2 Singlestep Ancestral Karras, aka k_dpmpp_2s_a
* DPM++ 2 Multistep Karras, aka k_dpmpp_2m
* DPM++ SDE Karras, aka k_dpmpp_sde
  These three produce detailed pictures in about 24 to 48 steps. If you need a more furry-noisy-grainy image, use the ancestral version. If you need a stable, sharp picture, try multistep. And if you want a somewhat more artistic approach, use SDE.

Generation Parameters: Steps

Well, it always depends on how good a computer you have. If you have a weak PC but wish to iterate quickly through pictures, use fewer steps. But beware - even non-ancestral samplers will change the image when the step counts differ enough, so 24-step DPM++2M will differ from 48-step DPM++2M.

But as a general rule of thumb - the closer you are to 48-72 steps, the better.

Generation Parameters: CFG / Dynamic Threshold

One of the Special Arts of Generation. You see, the A1111 tool has a specific addon called Dynamic CFG Threshold. Here is the link for this addon: https://github.com/mcmonkeyprojects/sd-dynamic-threshol...

It allows you to set a high CFG without 'burning up' the initial image. High CFG means more details, and more details mean a better, crisper image.
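
To get a feeling for why high CFG 'burns', here is a rough sketch of the standard classifier-free guidance formula applied to a single number. The input values are made up; a real sampler applies this to whole latent tensors on every step. As far as I understand it, the addon's job is to rein the oversized result back toward what the Mimic Scale would have produced.

```python
def apply_cfg(uncond: float, cond: float, cfg_scale: float) -> float:
    """Classifier-free guidance: uncond + scale * (cond - uncond)."""
    return uncond + cfg_scale * (cond - uncond)


# Hypothetical predictions without and with the prompt.
uncond, cond = 0.10, 0.30

for scale in (4, 7, 15, 20):
    # The gap between the two predictions is multiplied by the CFG scale,
    # so the guided value drifts further out of range as CFG grows.
    print(scale, round(apply_cfg(uncond, cond, scale), 2))
```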

Try starting with: CFG 15-20, Dynamic threshold enabled, Mimic Scale: 7, Threshold percentile: 100, Mimic mode: Half Cosine Up, Mimic scale minimum: 4, CFG Mode: Half Cosine Up, CFG scale minimum: 4 to get quite awesome results.

You may also want to try low CFG instead. Try 3-5 CFG for your generation, and you may encounter a lot of magic. You can always add more details to the image pretty easily.

A good middle ground is 7-9. I prefer 7.
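
If you drive A1111 from a script, the sampler, step and CFG advice above can be written down as a request body. This is only a sketch: it assumes A1111 was started with the `--api` flag and that the body is sent as JSON to its `/sdapi/v1/txt2img` endpoint. The helper name, the default values and the prompt are my own illustration, and the exact sampler names differ between A1111 versions, so check what your install lists.

```python
import json


def build_txt2img_payload(prompt, negative_prompt="",
                          sampler_name="DPM++ 2M Karras",
                          steps=48, cfg_scale=7.0,
                          width=640, height=640, clip_skip=2):
    """Collect the generation settings into a dict for /sdapi/v1/txt2img."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": sampler_name,  # name may vary by A1111 version
        "steps": steps,
        "cfg_scale": cfg_scale,
        "width": width,
        "height": height,
        # Clip Skip is stored in A1111 settings under this key.
        "override_settings": {"CLIP_stop_at_last_layers": clip_skip},
    }


# Hypothetical prompt, using meta-tags with spaces instead of underscores.
payload = build_txt2img_payload("fox, feral, fluffy tail, hi res")
print(json.dumps(payload, indent=2))
```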

Increasing amount of details

* Download a model called Upscaler 4x-UltraSharp. It will allow you to, well, upscale your initial image from 640x640 to a whopping 1280x1280 without significant loss in quality.
* Then, load your image (the big one that you just upscaled) into img2img.
* And then, use the same prompt you used to generate it, but you may want to play with denoising strength and CFG - start from 0.5 STR and about 7-11 CFG. Also, set a higher number of steps, if possible.

Results may, and probably will, pleasantly surprise you.
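
The img2img pass can be scripted in the same way. This sketch assumes A1111 is running with the `--api` flag and that the body goes to its `/sdapi/v1/img2img` endpoint; the helper name, the default values and the placeholder image bytes are made up for illustration. The upscaling itself is not shown, so the image is expected to be upscaled already.

```python
import base64


def build_img2img_payload(image_bytes, prompt,
                          denoising_strength=0.5, cfg_scale=9.0,
                          steps=60, size=1280):
    """Collect settings for a detail pass over an already upscaled image."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        # The API expects the source image as a base64 string.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,  # same prompt as the original generation
        "denoising_strength": denoising_strength,
        "cfg_scale": cfg_scale,
        "steps": steps,
        "width": size,
        "height": size,
    }


# Placeholder bytes; in practice, read your upscaled PNG from disk instead.
payload = build_img2img_payload(b"not a real image", "fox, feral, hi res")
print(payload["denoising_strength"], payload["cfg_scale"], payload["steps"])
```

Lower the denoising strength if the pass changes too much of the picture, and raise it if you see no new detail.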

Alternate method of increasing amount of details

Use my Artistic Lora Generator. But instead of grabbing random pictures, set tags to meet your requirements. If you want a nice-looking feral fox with a juicy tail, then set something like --tag 'fox feral orange_fur fluffy_tail score:>100', and you will most likely get a LoRA that really improves your foxies. Combine with everything above to get even better results.

Increasing amount of artistry

Clip Skip. This is a setting that somehow magically turns your unpleasant, sharp, childish drawings into more professional-looking pictures. Try setting it to 2, then try setting it to 1. Both can work quite enchantingly.

Increasing amount of tags

Well, that is simple. Go to any imageboard and look for pictures with a lot of tags. Use more pictures, 2 to 5, if necessary. Grab all the tags from there and shove them into your prompt - every tag will give you more and more details, and a much, much better look.
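
Merging tags from several pictures by hand gets boring fast, so here is a small sketch that does it. It drops duplicates, keeps the order, and swaps underscores for spaces, since the models above were trained on tags without underscores. The function name and the example tags are made up; the optional prefix covers models like FluffyRock that want 'e621' in front of every prompt.

```python
from typing import Iterable, Optional


def merge_tags(*tag_lists: Iterable[str], prefix: Optional[str] = None) -> str:
    """Merge tag lists into one comma-separated prompt without duplicates."""
    seen = set()
    merged = []
    if prefix:
        seen.add(prefix)
        merged.append(prefix)
    for tags in tag_lists:
        for tag in tags:
            clean = tag.strip().replace("_", " ")  # imageboard style -> spaces
            if clean and clean not in seen:
                seen.add(clean)
                merged.append(clean)
    return ", ".join(merged)


# Hypothetical tags copied from two different pictures.
picture_one = ["fox", "orange_fur", "fluffy_tail"]
picture_two = ["fox", "feral", "hi_res"]

print(merge_tags(picture_one, picture_two))
print(merge_tags(picture_one, picture_two, prefix="e621"))
```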

Clean up!

* If you have no skills, then use Photoshop. "Content-Aware Fill" will remove nasty details from your image.
* If you have a bit of skill, then use Inpainting inside A1111 or InvokeAI. Regenerate bad parts until they become good. If needed, spray on some same-colored paint.
* If you have more than a bit of skill, use Krita/SAI/any other painting tool, and use img2img to smooth out your changes... or do not use it at all, if you really like your manual changes.
* If you have a lot of skill... then you probably don't really need AI except for palettes and moods.

A picture cleaned of stray artifacts and hand errors looks much more pleasant! Of course, you can always use the 'oil painting' drawing style. :)

Tags

Always be aware of the words and tags you are able to use. For example, I mostly use a model called Fluffusion. I know a couple of things about it: it was trained on 900k images, and it was trained on all tags without underscores (spaces only). I also have a list of the tags it was trained on.
The thing is, on Inkbunny you cannot use author names in prompts. But you may reference their styles indirectly. Look at the tags below. These are meta-tags that will give you better control over your picture:

Tags that control 'overall quality'

absurd res - consider it like 'masterpiece, best quality' for NovelAI.
hi res
explicit

Tags that control mediums

digital media (artwork)
digital drawing (artwork)
3d (artwork)
traditional media (artwork)
digital painting (artwork)
graphite (artwork)
pencil (artwork)
pen (artwork)
pixel (artwork)
photography (artwork)
watercolor (artwork)
marker (artwork)
colored pencil (artwork)
pastel (artwork)
crayon (artwork)
gouache (artwork)
acrylic painting (artwork)
line art
hatching (art)
concept art

Tags that control level of detail

cel shading
flat colors
soft shading
restricted palette
shaded
colored
greyscale
monochrome
sketch
black and white
colored sketch
spot color
simple shading
guide lines
sepia
mixed media
photomorph
manga
cross-hatching
color contrast
high contrast

Tags that control layout

portrait
widescreen
full-length portrait
three-quarter portrait
bust portrait
headshot portrait
model sheet
expression sheet
cover art
cover page
magazine cover
album cover
book cover
dvd cover
comic
sequence
multiple images
multiple scenes
sketch page
one page comic

Tags that control specific details

english text
japanese text
korean text
chinese text
signature
watermark
artist name
dated
artist logo

Tags that control colors

blue and white
black and grey
purple and white
red and white
pink and white
brown and white
green and white
orange and white
yellow and white
cool colors
partially colored
colorful
ancient art

Viewed:	214 times
Added:	2 years, 6 months ago 03 May 2023 23:12 CEST
