|
--- |
|
license: creativeml-openrail-m |
|
tags: |
|
- text-to-image |
|
- stable-diffusion |
|
- anime |
|
- aiart |
|
--- |
|
|
|
**This model is trained on 6(+1?) characters from ONIMAI: I'm Now Your Sister! (お兄ちゃんはおしまい!)** |
|
|
|
### Example Generations |
|
|
|
![00009-20230210181727-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00009-20230210181727-min.png) |
|
![00041-20230210195115-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00041-20230210195115-min.png) |
|
|
|
### Usage |
|
|
|
The model is shared in both diffusers and safetensors formats. |
|
As for the trigger words, the six characters can be prompted with |
|
`OyamaMahiro`, `OyamaMihari`, `HozukiKaede`, `HozukiMomiji`, `OkaAsahi`, and `MurosakiMiyo`. |
|
`TenkawaNayuta` is also tagged, but she appears in fewer than 10 images, so don't expect good results. |
|
There are also three different styles trained into the model: `aniscreen`, `edstyle`, and `megazine` (yes, typo). |
|
As usual, you can get multiple-character images, but it becomes difficult starting from four characters. |
|
By the way, the model is trained at clip skip 1. |
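Here is a minimal sketch of how the trigger words could be combined and passed to the `diffusers` library. It assumes the repository id `alea31415/onimai-characters` (taken from the image URLs on this page) loads with `StableDiffusionPipeline`. The helper names `build_prompt` and `generate`, the step count, and the extra tags are illustrative assumptions, not part of the model. No `clip_skip` argument is passed, which to my understanding matches clip skip 1 in the web UI convention.

```python
CHARACTERS = ["OyamaMahiro", "OyamaMihari", "HozukiKaede",
              "HozukiMomiji", "OkaAsahi", "MurosakiMiyo"]
STYLES = ["aniscreen", "edstyle", "megazine"]


def build_prompt(characters, style=None, extra=""):
    """Join character trigger words, an optional style word, and free-form tags."""
    unknown = [c for c in characters if c not in CHARACTERS]
    if unknown:
        raise ValueError(f"unknown trigger word(s): {unknown}")
    if style is not None and style not in STYLES:
        raise ValueError(f"unknown style: {style}")
    parts = list(characters)
    if style:
        parts.append(style)
    if extra:
        parts.append(extra)
    return ", ".join(parts)


def generate(prompt, repo_id="alea31415/onimai-characters"):
    """Load the diffusers-format weights and generate one image (hypothetical helper)."""
    # Imported lazily so build_prompt works without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")  # assumes a CUDA GPU is available
    return pipe(prompt, num_inference_steps=30).images[0]
```

For example, `generate(build_prompt(["OyamaMahiro", "OyamaMihari"], style="aniscreen", extra="2girls")).save("out.png")` would request the two siblings in the screenshot style.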
|
|
|
The following images show generations from different checkpoints. |
|
The default one is that of step 22828, but all the checkpoints starting from step 9969 can be found in the `checkpoints` directory. |
|
They are all sufficiently good at the six characters, but later ones are better at `megazine` and `edstyle` (possibly at the risk of overfitting, I don't really know). |
|
|
|
![xyz_grid-0000-20230210154700.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0000-20230210154700.jpg) |
|
![xyz_grid-0001-20230210155723.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0001-20230210155723.jpg) |
|
![xyz_grid-0006-20230210163625.jpg](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/grids/xyz_grid-0006-20230210163625.jpg) |
|
|
|
### More Generations |
|
|
|
![00011-20230210182642-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00011-20230210182642-min.png) |
|
![00003-20230210175009-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00003-20230210175009-min.png) |
|
![00005-20230210175301-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00005-20230210175301-min.png) |
|
![00016-20230210183918-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00016-20230210183918-min.png) |
|
![00019-20230210184731-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00019-20230210184731-min.png) |
|
![00038-20230210194326-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00038-20230210194326-min.png) |
|
![00039-20230210194529-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00039-20230210194529-min.png) |
|
![00043-20230210195945-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00043-20230210195945-min.png) |
|
![00047-20230210202801-min.png](https://huggingface.co/alea31415/onimai-characters/resolve/main/example_generations/00047-20230210202801-min.png) |
|
|
|
### Dataset Description |
|
|
|
The dataset is prepared via the workflow detailed here: https://github.com/cyber-meow/anime_screenshot_pipeline |
|
|
|
It contains 21412 images with the following composition: |
|
|
|
- 2133 onimai images separated into four types |
|
  - 1496 anime screenshots from the first six episodes (for style `aniscreen`) |
|
  - 70 screenshots of the ending of the anime (for style `edstyle`, not counted in the 1496 above) |
|
  - 528 fan arts (probably including some official art) |
|
  - 39 scans of the manga covers (for style `megazine`; don't ask me why I chose this name, it is bad but it turns out to work) |
|
- 19279 regularization images, intended to be as varied as possible while staying in anime style (i.e. no photorealistic images are used) |
|
|
|
Note that the model is trained with a specific weighting scheme to balance the different concepts, so not every image is weighted equally. |
|
After applying the per-image repeat we get around 145K images per epoch. |
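As a quick sanity check on the numbers above, the sketch below sums the listed counts and derives the average repeat implied by roughly 145K images per epoch. The per-concept weights themselves are not given here, so only the overall average is computed, and the 145K figure is treated as approximate.

```python
# Image counts as listed in the dataset description.
onimai = {"aniscreen": 1496, "edstyle": 70, "fan arts": 528, "megazine": 39}
regularization = 19279

onimai_total = sum(onimai.values())
dataset_total = onimai_total + regularization

# Approximate figure from the text: images seen per epoch after repeats.
images_per_epoch = 145_000
average_repeat = images_per_epoch / dataset_total

print(f"onimai images: {onimai_total}")
print(f"all images:    {dataset_total}")
print(f"average repeat per image: {average_repeat:.2f}")
```

The average hides the imbalance: small concepts such as `megazine` (39 images) presumably get far more repeats than the regularization set.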
|
|
|
### Training |
|
|
|
Training is done with the [EveryDream2](https://github.com/victorchall/EveryDream2trainer) trainer, with [ACertainty](https://huggingface.co/JosephusCheung/ACertainty) as the base model. |
|
The following configuration is used: |
|
|
|
- resolution 512 |
|
- cosine learning rate scheduler, lr 2.5e-6 |
|
- batch size 8 |
|
- conditional dropout 0.08 |
|
- change the beta schedule from `scaled_linear` to `linear` in the `config.json` of the model's scheduler |
|
- clip skip 1 |
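The beta schedule change in the list above can be sketched as a small JSON edit. This assumes the scheduler config follows the `diffusers` layout, where the schedule is stored under the key `beta_schedule`. The function name `set_beta_schedule` and the file path in the usage note are hypothetical.

```python
import json
from pathlib import Path


def set_beta_schedule(config_path, new_value="linear"):
    """Rewrite the `beta_schedule` entry of a scheduler config file in place.

    Returns the previous value so the change can be checked or reverted.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    old_value = config.get("beta_schedule")
    config["beta_schedule"] = new_value
    path.write_text(json.dumps(config, indent=2))
    return old_value
```

Calling `set_beta_schedule("path/to/base_model/scheduler/scheduler_config.json")` on a local copy of the base model would return the old value (expected to be `scaled_linear`) and leave `linear` in the file.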
|
|
|
I trained for two epochs, whereas the default release model was trained for 22828 steps as mentioned above. |
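To relate the step counts to epochs, here is a rough calculation from the figures on this page. It assumes one optimizer step per batch of 8 (no gradient accumulation, single device) and takes the 145K images per epoch as approximate, so the results are estimates only.

```python
images_per_epoch = 145_000  # approximate, after per-image repeats
batch_size = 8
release_step = 22828        # step of the default release checkpoint

steps_per_epoch = images_per_epoch / batch_size
release_epochs = release_step / steps_per_epoch
two_epoch_steps = 2 * steps_per_epoch

print(f"steps per epoch:      ~{steps_per_epoch:.0f}")
print(f"release checkpoint at ~{release_epochs:.2f} epochs")
print(f"two full epochs:      ~{two_epoch_steps:.0f} steps")
```

Under these assumptions the default release sits a bit past the first epoch, well before the end of the two-epoch run.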