Update deresute/release_notes/v1.3.md
deresute/release_notes/v1.3.md
CHANGED
@@ -315,6 +315,13 @@ Tags generated with [SmilingWolf/wd-v1-4-convnextv2-tagger-v2](https://huggingfa
 The data source also provides info about which other characters are also present in the card art ("cameos"), so following [a previous model](https://civitai.com/models/27600/)
 the caption template `MainChar/OtherChar1/OtherChar2..., other tags shuffled` was used. However, it worked poorly this time; character attributes blend noticeably.
 
+Inspired by [a 2022 paper](https://openreview.net/pdf?id=Uad23IcIEs), a custom LR scheduler was used:
+```
+SequentialLR(optimizer, [
+    LinearLR(optimizer, 0.1, total_iters=steps_per_epoch),
+    CosineAnnealingWarmRestarts(optimizer, steps_per_epoch*2, T_mult=2)
+], [steps_per_epoch])
+```
 
 Training cost: ~7 T4-hours (in addition to v1.1 and v1.2 costs)
 
@@ -438,8 +445,7 @@ cache_latents = false
 optimizer_type = "AdamW"
 learning_rate = 0.001
 max_grad_norm = 1.0
-
-lr_warmup_steps = 100
+# custom lr schedule
 
 [dataset_arguments]
 debug_dataset = false
@@ -452,7 +458,7 @@ save_every_n_epochs = 1
 max_token_length = 225
 mem_eff_attn = false
 xformers = true
-max_train_epochs =
+max_train_epochs = 15
 max_data_loader_n_workers = 1
 persistent_data_loader_workers = false
 gradient_checkpointing = false