## flux-hasui-lora-d4-sigmoid-raw-gs1.0.safetensors
Experimental LoRA for FLUX.1 dev.

Trained with the `sd3` branch of `sd-scripts` (as of Aug. 11). __NOTE:__ These settings require more than 26GB of VRAM. Add `--fp8_base` to enable fp8 training and reduce VRAM usage (see the sketch after the command below).

```
accelerate launch --mixed_precision bf16 --num_cpu_threads_per_process 1 flux_train_network.py \
  --pretrained_model_name_or_path flux1/flux1-dev.sft --clip_l sd3/clip_l.safetensors \
  --t5xxl sd3/t5xxl_fp16.safetensors --ae flux1/ae_dev.sft \
  --cache_latents_to_disk --save_model_as safetensors --sdpa \
  --persistent_data_loader_workers --max_data_loader_n_workers 2 --seed 42 \
  --gradient_checkpointing --mixed_precision bf16 --save_precision bf16 \
  --network_module networks.lora_flux --network_dim 4 \
  --optimizer_type adamw8bit --learning_rate 1e-3 \
  --network_train_unet_only --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk \
  --highvram --max_train_epochs 4 --save_every_n_epochs 1 \
  --dataset_config hasui_1024_bs1.toml --output_dir flux/lora --output_name lora-name \
  --timestep_sampling sigmoid --model_prediction_type raw --guidance_scale 1.0
```
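
For example, the lower-VRAM fp8 variant is the same command with `--fp8_base` appended (a sketch; all other arguments are unchanged from above):

```
accelerate launch --mixed_precision bf16 --num_cpu_threads_per_process 1 flux_train_network.py \
  ... (same arguments as above) ... \
  --fp8_base
```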

The dataset config `.toml` (passed via `--dataset_config`) is below.
```toml
[general]
flip_aug = true
color_aug = false

[[datasets]]
enable_bucket = true
resolution = [1024,1024]
bucket_reso_steps = 64
max_bucket_reso = 2048
min_bucket_reso = 128
bucket_no_upscale = false
batch_size = 1
random_crop = false
shuffle_caption = false

  [[datasets.subsets]]
  image_dir = "path/to/train/images"
  num_repeats = 1
  caption_extension = ".txt"
```
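
For reference, the subset's `image_dir` is expected to contain the training images together with one caption file per image, named after the image with the configured `caption_extension` (here `.txt`). A minimal sketch of the layout, with hypothetical file names:

```
path/to/train/images/
├── image_001.png
├── image_001.txt   <- caption for image_001.png
├── image_002.png
└── image_002.txt
```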


## sdxl-negprompt8-v1m.safetensors
Negative embedding for SDXL. Number of vectors per token = 8.
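
As with other textual inversion embeddings, the usual way to use it (an assumption; the trigger word is taken to be the filename) is to place the file in your UI's embeddings folder and write its name in the negative prompt, e.g.:

```
negative prompt: sdxl-negprompt8-v1m, lowres, worst quality
```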
 
## stable-cascade-c-lora-hasui-v02.safetensors
A sample LoRA for Stable Cascade Stage C.

Feb 22, 2024 update: fixed a bug where the LoRA was not applied to some Attention modules (to_q/k/v and to_out).

__This is an experimental model, so the format of the weights may change in the future.__

- a painting of an anthropomorphic penguin sitting in a cafe reading a book and having a coffee --w 1024 --h 1024 --d 1
![sample1](penguin.png)

- a painting of japanese shrine in winter with snowfall --w 832 --h 1152 --d 1234
![sample2](shrine.png)

This model was trained on 169 captioned images: U-Net only, dim=4, conv_dim=4, alpha=1, lr=1e-3, 4 epochs, mixed precision bf16, 8-bit AdamW, batch size 8, resolution 1024x1024 with aspect ratio bucketing. VRAM usage is approximately 22 GB.