
Tagging methodology for Kazusa (blue archive)

README / Intro

Since I've seen a few people share this already I'll provide this disclaimer.

This is not really intended to be a guide, it's just a log/checklist of my process, for my own benefit, since I repeat this for a lot of LoRAs and I got tired of winging it every single time. I've put only the slightest amount of effort into making it accessible to others.

I don't claim that any or all of these are optimal, nor can I confidently put them forth as recommendations. They're literally just a record of the steps I follow while tagging, gradually developed over ~16 characters using some version of the below process.

Still, I can at least point to my pre-Koharu LoRAs (which used pure WD1.4 tags) and the ones that came after (where I started heavily editing tags) and see a steady progression in quality and prompting flexibility despite using mostly the same training settings for each one.

Yes, it takes forever to do all of this shit. No, I don't recommend it unless you're extremely autistic; raw WD1.4 tags are probably good enough for most people. If you intend to do this for more than a few characters, I strongly recommend learning Hydrus; it makes all of this way, way less tedious than doing it with crappier tools.


Prep

  • Scraped 1girl kazusa_(blue_archive) order:popularity from sancom and curated for quality.
    • Kazusa has a shitload of good art so I had to be very picky to get down to 280 images, which is still a lot. In hindsight I think huge datasets aren't really a problem; they let you train for longer without overfitting.
    • Gelbooru is probably fine too. Danbooru sucks for ロリ unless you have Gold.
    • I also got a few newer images from pixiv, don't remember which ones.
  • Exported final images from Hydrus to feed into WD1.4 Tagger
  • Auto-tagged with WD1.4 Swinv2 at 0.25 confidence
  • Reimported images+tags into Hydrus using the .txt sidecar feature. I strongly recommend putting WD1.4 tags in a separate tag domain so they aren't mixed in with shit scraped from boorus.
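
Before reimporting, it can be worth checking that every image actually got a sidecar and that the tags parse cleanly. Below is a minimal sketch of that check, not part of the original process. It assumes the tagger wrote one `<image stem>.txt` file per image containing comma-separated tags; sidecar naming and separators vary between tools, so adjust for your setup. The names `parse_caption`, `load_sidecars`, and `IMAGE_SUFFIXES` are made up for this example.

```python
from pathlib import Path

# Assumption: these are the only image types in the export folder.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def parse_caption(text):
    """Split a comma-separated caption into stripped, non-empty tags."""
    return [tag.strip() for tag in text.split(",") if tag.strip()]


def load_sidecars(folder):
    """Map each image filename to its tag list, or None if it has no sidecar.

    Assumes the sidecar for `foo.png` is `foo.txt` in the same folder.
    """
    result = {}
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        sidecar = image.with_suffix(".txt")
        if sidecar.exists():
            result[image.name] = parse_caption(sidecar.read_text(encoding="utf-8"))
        else:
            result[image.name] = None
    return result
```

Any filename that maps to `None` was skipped by the tagger (or uses a different sidecar name) and should be sorted out before importing.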

Tagging

  • Tag unique features
    • halo / demon horns / low wings
    • Remove when not present or out of view. WD1.4 likes putting halo even on images where no halo is visible.
    • Kazusa: halo / animal ears
      • Pruned extra ears as it seems redundant and intrinsic to the character.
  • Tag outfit variants with a single master tag
    • Kazusa:
      • Uniform: school uniform / black jacket
        • Sometimes the jacket appears without the rest of the uniform; I did not tag those images school uniform
      • Non-canon costumes
        • Add alternate costume
    • Nudity (WD1.4 usually does this accurately)
      • nude / completely nude
  • Prune eye colors
    • Keep tags which describe unusual eye features (multicolored eyes, heterochromia, slit pupils) as they can otherwise be too subtle and inconsistently drawn for the AI to notice
  • Prune hair colors
    • This includes two-toned hair, gradient hair, etc. The AI learns all of these very consistently without the tags, likely because artists tend to draw them consistently
  • Partially prune hair styles
    • Leave key, defining style tags like twintails, ponytail, short hair with long locks, twin braids, etc.
    • Prune exceedingly common tags like bangs / sidelocks / eyebrows visible through hair / hair between eyes, etc.
      • Somewhat arbitrary, but I just don't think there's much value in them because they're ubiquitous and caption space is limited
    • Prune length, except for images which differ from the character's usual length
      • If you don't do this, the AI is more likely to get the hair length wrong when length isn't prompted, which isn't a huge deal.
      • Add alternate hairstyle and/or alternate hair length on applicable images, which can be used to more easily change styles while prompting
    • Kazusa: short hair, colored inner hair -- while I would usually prune these, they're really her only defining hairstyle traits
  • Fix up hair ornaments
    • Prune generic hair ornament in favor of more specificity
      • hairclip / black headband / hair flower / hair ribbon, etc.
    • Consolidate tags that have color variants (headband >> black headband)
    • Kazusa: hairclip
  • Consolidate outfits
    • Only tag an item when it is actually visible. If it is only barely visible along the edge of an image, keep in mind it may be cropped during bucketing.
    • Danbooru's wiki entry for a character often provides a good list of tags for a character's entire outfit.
    • Kazusa outfits:
      • School Uniform
        • black choker
        • hooded jacket
        • black jacket
        • green sailor collar
        • pink neckerchief
        • miniskirt
        • pleated skirt
        • white skirt
        • black pantyhose
        • sneakers
  • Fix up sleeves
    • e.g. long sleeves / puffy long sleeves / detached sleeves
    • You only need one, but pick one and be consistent. If sleeves aren't tagged the AI tends to add them inappropriately (such as when prompting for sleeveless outfits or nudity)
  • Fix up collars
    • e.g. detached collar / collared shirt / choker / etc.
    • Same deal as sleeves, they tend to appear when unwanted if not consistently tagged according to actual visibility
  • Fix up clothing state
    • e.g. open jacket / open shirt / partially undressed / off shoulder
    • The tagger is generally good at this but it can help to double-check for weird outfits
  • Tag expressions
    • This is tedious and the autotagger doesn't help you out much, but tagging these can really help the AI nail multiple iconic expressions for a character
    • Start by searching for images without one of these, and add them.
      • open mouth
      • closed mouth
      • parted lips
        • Sometimes applies with open mouth
    • Then proceed through each image and add one of these
      • smile / light smile / :d / grin (exposed teeth only)
      • :o / :< / expressionless / serious
      • wavy mouth / embarrassed
      • pout / :t / tsundere
      • nervous / nervous smile
      • flustered / swirly eyes / @_@
      • surprised / o_o / wide-eyed
      • upset / annoyed / frustrated / v-shaped eyebrows
      • naughty face / seductive smile
      • smug / :3 / smirk
      • yelling / frown
      • eyes closed / one eye closed
        • WD1.4 almost always gets these two
  • Tag camera angles/composition
    • Most of these aren't very high value, but from x can be helpful.
    • cowboy shot
    • upper body
    • full body
    • portrait
    • feet out of frame
    • cropped torso / cropped legs
    • from side / from above / from below / from behind
  • Tag iconic poses, actions, or props
    • Props need to show up often in training data for this to be worth it.
    • v / peace sign / standing on one leg
    • holding dango / weapon case / fashion magazine
    • Kazusa
      • mouth hold
      • eating
      • macaron
  • Flip through each image and use Hydrus's "related tags" feature to quickly identify important tags that might be missing.
    • This feature looks at other images with similar tags to provide suggestions. Good for spotting things you or the tagger might have missed.
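
I do all of the above by hand in Hydrus, but the mechanical parts of the checklist (pruning eye and hair colors, pruning ubiquitous hair tags, consolidating color variants, dropping generic hair ornament when a specific one is present, and finding images with no mouth tag) could be sketched as a script over the tag lists. This is a rough illustration, not the process I actually ran. The rule sets below are assumptions assembled from the examples in this checklist, and the names `clean_tags` and `missing_mouth_tag` are hypothetical.

```python
# Assumed rule sets, built from the examples in the checklist above.
COLORS = {"aqua", "black", "blonde", "blue", "brown", "green", "grey",
          "orange", "pink", "purple", "red", "white", "yellow"}
MULTI_HAIR_COLOR_TAGS = {"two-tone hair", "gradient hair", "multicolored hair"}
COMMON_HAIR_TAGS = {"bangs", "sidelocks", "eyebrows visible through hair",
                    "hair between eyes"}
SPECIFIC_ORNAMENTS = {"hairclip", "black headband", "hair flower", "hair ribbon"}
COLOR_VARIANTS = {"headband": "black headband"}  # generic -> specific
MOUTH_TAGS = {"open mouth", "closed mouth", "parted lips"}


def is_color_tag(tag, noun):
    """True for tags of the exact form '<color> <noun>', e.g. 'pink eyes'."""
    color, _, rest = tag.partition(" ")
    return rest == noun and color in COLORS


def clean_tags(tags, keep=frozenset()):
    """Apply the pruning and consolidation rules to one image's tags.

    Tags listed in `keep` are never pruned (per-character exceptions).
    """
    out = []
    for tag in tags:
        prunable = (
            is_color_tag(tag, "eyes")
            or is_color_tag(tag, "hair")
            or tag in MULTI_HAIR_COLOR_TAGS
            or tag in COMMON_HAIR_TAGS
        )
        if prunable and tag not in keep:
            continue
        tag = COLOR_VARIANTS.get(tag, tag)
        if tag not in out:  # avoid duplicates after consolidation
            out.append(tag)
    # Drop the generic ornament tag only when a specific one is present.
    if any(tag in SPECIFIC_ORNAMENTS for tag in out):
        out = [tag for tag in out if tag != "hair ornament"]
    return out


def missing_mouth_tag(tags):
    """True if the image has none of the open/closed/parted mouth tags."""
    return not (MOUTH_TAGS & set(tags))
```

Unusual eye tags such as heterochromia or slit pupils survive because only exact `<color> eyes` tags are pruned. Visibility checks (halo, sleeves, collars, outfit pieces) and the expression tags themselves still need a human looking at each image.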