---
license: creativeml-openrail-m
tags:
- text-to-image
- stable-diffusion
- anime
- aiart
---
**This model is trained on 6(+1?) characters from ONIMAI: I'm Now Your Sister! (γŠε…„γ‘γ‚ƒγ‚“γ―γŠγ—γΎγ„!)**
### Example Generations
![00009-20230210181727-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00009-20230210181727-min.png)
![00041-20230210195115-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00041-20230210195115-min.png)
### Usage
The model is shared in both diffusers and safetensors formats.
As for the trigger words, the six characters can be prompted with
`OyamaMahiro`, `OyamaMihari`, `HozukiKaede`, `HozukiMomiji`, `OkaAsahi`, and `MurosakiMiyo`.
`TenkawaNayuta` is also tagged, but she appears in fewer than 10 images, so don't expect good results for her.
There are also three different styles trained into the model: `aniscreen`, `edstyle`, and `megazine` (yes, it's a typo).
As usual, you can generate multi-character images, but it becomes difficult starting from four characters.
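Below is a minimal loading sketch using the diffusers library. The repo id matches the image URLs above; the prompt wording and sampler settings are placeholders, so adjust them to taste.

```python
# A minimal sketch using the diffusers library; the repo id matches the
# image URLs above. Prompt wording and sampler settings are placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "alea31415/onimai-characters",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Combine a character trigger word with one of the style tags.
prompt = "OyamaMahiro, aniscreen, 1girl, smile, looking at viewer"
image = pipe(prompt, num_inference_steps=28, guidance_scale=7.0).images[0]
image.save("mahiro.png")
```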
The following grids show generations from different checkpoints.
The default release is the step-22828 checkpoint, but all checkpoints from step 9969 onward can be found in the `checkpoints` directory.
They are all sufficiently good at the six characters, but later ones are better at `megazine` and `edstyle` (possibly at the risk of overfitting; I don't really know).
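If you want to try an intermediate checkpoint, something like the sketch below should fetch them; the `checkpoints` directory name comes from this README, but the subfolder names inside it are not listed here, so browse the repo before loading a specific one.

```python
# Sketch for fetching the intermediate checkpoints; the `checkpoints`
# directory name comes from this README, but the subfolder names inside it
# are assumptions -- browse the repo before loading a specific one.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    "alea31415/onimai-characters",
    allow_patterns=["checkpoints/*"],
)
print(local_dir)  # point from_pretrained (or your UI) at the checkpoint here
```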
![xyz_grid-0000-20230210154700.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0000-20230210154700.jpg)
![xyz_grid-0001-20230210155723.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0001-20230210155723.jpg)
![xyz_grid-0006-20230210163625.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0006-20230210163625.jpg)
### More Generations
![00011-20230210182642-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00011-20230210182642-min.png)
![00003-20230210175009-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00003-20230210175009-min.png)
![00005-20230210175301-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00005-20230210175301-min.png)
![00016-20230210183918-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00016-20230210183918-min.png)
![00019-20230210184731-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00019-20230210184731-min.png)
![00038-20230210194326-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00038-20230210194326-min.png)
![00039-20230210194529-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00039-20230210194529-min.png)
![00043-20230210195945-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00043-20230210195945-min.png)
![00047-20230210202801-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00047-20230210202801-min.png)
### Dataset Description
The dataset is prepared via the workflow detailed here: https://github.com/cyber-meow/anime_screenshot_pipeline
It contains 21412 images with the following composition:
- 2133 onimai images, split into four types
  - 1496 anime screenshots from the first six episodes (for style `aniscreen`)
  - 70 screenshots of the ending of the anime (for style `edstyle`, not counted in the 1496 above)
  - 528 fan arts (probably including some official art)
  - 39 scans of manga covers (for style `megazine`; don't ask me why I chose this name, it is bad but it turns out to work)
- 19279 regularization images, intended to be as varied as possible while staying in anime style (i.e., no photorealistic images are used)

Note that the model is trained with a specific weighting scheme to balance the different concepts, so images are not weighted equally.
After applying the per-image repeats, we get around 145K images per epoch.
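To make the balancing idea concrete, here is a toy sketch (not the actual weighting scheme used for this model) that assigns per-image repeats inversely proportional to group size, using the onimai group counts listed above:

```python
# Toy sketch only -- NOT the actual weighting scheme used for this model.
# Idea: give each image a repeat count inversely proportional to the size of
# its concept group, so small groups (e.g. `megazine`) are not drowned out.
group_sizes = {"aniscreen": 1496, "edstyle": 70, "fanart": 528, "megazine": 39}
target = max(group_sizes.values())  # hypothetical balancing target

repeats = {group: max(1, round(target / n)) for group, n in group_sizes.items()}
print(repeats)  # {'aniscreen': 1, 'edstyle': 21, 'fanart': 3, 'megazine': 38}

# The effective epoch size is the sum of repeats, not the raw image count.
epoch_size = sum(repeats[g] * n for g, n in group_sizes.items())
print(epoch_size)  # 6032 for this toy example
```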
### Training
Training is done with the [EveryDream2](https://github.com/victorchall/EveryDream2trainer) trainer, with [ACertainty](https://huggingface.co/JosephusCheung/ACertainty) as the base model.
The following configuration is used:
- resolution 512
- cosine learning rate scheduler, lr 2.5e-6
- batch size 8
- conditional dropout 0.08
- changed the beta schedule from `scaled_linear` to `linear` in the `config.json` of the model's scheduler (a sketch of this edit follows below)
I trained for two epochs, whereas the default release model was trained for 22828 steps, as mentioned above.
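For reference, here is a hedged sketch of the scheduler tweak from the configuration list, assuming a local copy of the model in the standard diffusers layout where the scheduler config lives at `scheduler/scheduler_config.json`:

```python
# Hedged sketch of the scheduler tweak above, assuming a local copy of the
# model in the standard diffusers layout (scheduler/scheduler_config.json).
import json
from pathlib import Path

cfg_path = Path("onimai-characters/scheduler/scheduler_config.json")
cfg = json.loads(cfg_path.read_text())
cfg["beta_schedule"] = "linear"  # was "scaled_linear"
cfg_path.write_text(json.dumps(cfg, indent=2))
```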