---
tags:
- stable-diffusion
- stable-diffusion-xl
---

# Nekoray-XL-V0.7

## Model Card

NekoRay v0.7 is an SDXL checkpoint finetuned from SDXL 1.0 on 1.5M quality-tagged images from a selection of image sites. It aims to be the next Waifu Diffusion, giving the SD community more freedom in generation.

The project is a WIP, and further checkpoints with enhanced augmentations and more images are currently being developed.

Two models trained on the same dataset, hardware, and hyperparameters are currently available:

- **[fp16mixed](https://huggingface.co/trojblue/nekoray-xl-fulldan-bench-1.5m/blob/main/nekoray-xl-1.5m-fp16mixed_e02.safetensors)**: 2 epochs, half precision
- **[32full](https://huggingface.co/trojblue/nekoray-xl-fulldan-bench-1.5m/blob/main/nekoray-xl-1.5m-pdg32_e02.safetensors)**: 1.7 epochs (still training), full precision

## Usage

It's recommended to use *exactly* the resolutions listed below, since the original SDXL doesn't perform well outside of them. For prompts, Danbooru-style captions are preferred.

We use the same aspect ratios as the original SDXL:

| Height | Width | Aspect Ratio (H/W) |
| ------ | ----- | ------------------ |
| 512 | 2048 | 0.25 |
| 512 | 1984 | 0.26 |
| 512 | 1920 | 0.27 |
| 512 | 1856 | 0.28 |
| 576 | 1792 | 0.32 |
| 576 | 1728 | 0.33 |
| 576 | 1664 | 0.35 |
| 640 | 1600 | 0.4 |
| 640 | 1536 | 0.42 |
| 704 | 1472 | 0.48 |
| 704 | 1408 | 0.5 |
| 704 | 1344 | 0.52 |
| 768 | 1344 | 0.57 |
| 768 | 1280 | 0.6 |
| 832 | 1216 | 0.68 |
| 832 | 1152 | 0.72 |
| 896 | 1152 | 0.78 |
| 896 | 1088 | 0.82 |
| 960 | 1088 | 0.88 |
| 960 | 1024 | 0.94 |
| 1024 | 1024 | 1.0 |
| 1024 | 960 | ... |

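
As a rough illustration of how the resolution table might be applied, the sketch below snaps a requested image size to the closest listed bucket by height/width ratio. It only knows the rows shown in the table (which is truncated after 1024 x 960, so taller portrait sizes are not covered), and `nearest_bucket` is a hypothetical helper, not part of any released tooling. For example, a 600 x 1200 (height x width) request maps to 704 x 1408.

```python
from __future__ import annotations

# (height, width) pairs copied from the table; rows elided by "..." are not included.
BUCKETS = [
    (512, 2048), (512, 1984), (512, 1920), (512, 1856),
    (576, 1792), (576, 1728), (576, 1664),
    (640, 1600), (640, 1536),
    (704, 1472), (704, 1408), (704, 1344),
    (768, 1344), (768, 1280),
    (832, 1216), (832, 1152),
    (896, 1152), (896, 1088),
    (960, 1088), (960, 1024),
    (1024, 1024), (1024, 960),
]


def nearest_bucket(height: int, width: int) -> tuple[int, int]:
    """Return the listed (height, width) whose height/width ratio is closest to the request."""
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    target = height / width
    # Pick the bucket that minimizes the absolute difference in aspect ratio.
    return min(BUCKETS, key=lambda hw: abs(hw[0] / hw[1] - target))
```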
For prompts, the following keywords are appended for better separation of genres:

```
'sensitive-rated', 'questionable-rated', 'explicit-rated'
```

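
One way to read the rating keywords is as an extra tag added at the end of a comma-separated, Danbooru-style prompt. The sketch below does that. The position of the keyword and the comma separator are assumptions rather than something the model card specifies, and `with_rating` is a hypothetical helper.

```python
# The three rating keywords listed in this model card.
RATING_KEYWORDS = ("sensitive-rated", "questionable-rated", "explicit-rated")


def with_rating(prompt: str, rating: str) -> str:
    """Append one rating keyword to a comma-separated tag prompt."""
    if rating not in RATING_KEYWORDS:
        raise ValueError(f"unknown rating keyword: {rating!r}")
    # Normalize the prompt into a clean list of tags.
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    # Avoid duplicating the keyword if the prompt already carries it.
    if rating not in tags:
        tags.append(rating)
    return ", ".join(tags)
```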
The models are intended to be used as pretrained checkpoints, and further finetuning is **strongly recommended** for downstream use. For more information, see the Finetuning section below.

## Finetuning

Finetuning on SDXL inherently provides better clarity and reduced noisiness at higher resolutions compared to SD 1.4. We've had good results with further finetuning on various anime-related subject matter, including but not limited to:

- style finetunes (tested on nijijourney images and PVC/figure datasets)
- character finetunes (tested on 8 Blue Archive characters)
- concept finetunes (tested on NSFW gestures)

Actual samples of downstream finetunes will be posted once we get the finetuners' consent.

## License

The model (which is still very much a WIP) is intended to be used as a foundation for various downstream finetunes. The license is under discussion, but it will generally follow the OpenRAIL-M agreement.