---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-longt5-large-128
results:
- task:
name: Sequence-to-sequence Language Modeling
type: text2text-generation
dataset:
name: lmqg/qg_dequad
      type: lmqg/qg_dequad
args: default
metrics:
- name: BLEU-4
type: bleu4
value: 6.99
- name: F1
type: f1
value: 28.39
- name: ROUGE-1
type: rouge1
value: 28.96
- name: ROUGE-2
type: rouge2
value: 11.91
- name: ROUGE-L
type: rougel
value: 27.92
- name: ROUGE-Lsum
type: rougelsum
value: 27.91
- name: Exact Match
type: exact_match
value: 0.95
---
# german-jeopardy-longt5-large-128
This model is a fine-tuned version of [google/long-t5-tglobal-large](https://huggingface.co/google/long-t5-tglobal-large) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set (see the metric-computation sketch after the list):
- Loss: 2.6149
- Brevity Penalty: 0.9386
- System Length: 19554
- Reference Length: 20793
- ROUGE-1: 28.96
- ROUGE-2: 11.91
- ROUGE-L: 27.92
- ROUGE-Lsum: 27.91
- Exact Match: 0.95
- BLEU: 6.99
- F1: 28.39
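How exactly these scores were computed is not documented in this card; the sketch below only illustrates how BLEU, ROUGE, and exact-match scores of this kind can be obtained with the Hugging Face `evaluate` library, using made-up prediction and reference strings.

```python
# Illustrative only: the exact metric configuration behind the numbers above is
# not specified in this card. The prediction/reference pair is made up.
import evaluate

predictions = ["Wer war der erste Bundeskanzler der Bundesrepublik Deutschland?"]
references = ["Wer war der erste Bundeskanzler Deutschlands?"]

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")
exact_match = evaluate.load("exact_match")

# BLEU expects one list of reference strings per prediction.
print(bleu.compute(predictions=predictions, references=[references], max_order=4))
print(rouge.compute(predictions=predictions, references=references))
print(exact_match.compute(predictions=predictions, references=references))
```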
## Model description
See [google/long-t5-tglobal-large](https://huggingface.co/google/long-t5-tglobal-large) for more information about the
model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24GB of VRAM.
## Intended uses & limitations
This model can be used for question generation on German text.
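A minimal usage sketch is shown below. The repository namespace (`<namespace>`) is a placeholder for wherever this checkpoint is hosted, and the input format, a plain German context passage, is an assumption; adapt it to match how the training data was preprocessed (for example, if the target answer is highlighted in the context).

```python
# Minimal inference sketch; not taken from the model card. Replace <namespace>
# with the actual Hugging Face namespace of this checkpoint, and adjust the
# input format to match the preprocessing used during fine-tuning.
from transformers import pipeline

generator = pipeline(
    "text2text-generation",
    model="<namespace>/german-jeopardy-longt5-large-128",
)

# Assumed input: a plain German context passage to generate a question from.
context = (
    "Die Zugspitze ist mit 2962 Metern der höchste Berg Deutschlands "
    "und liegt an der Grenze zu Österreich."
)

print(generator(context, max_new_tokens=64)[0]["generated_text"])
```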
## Training and evaluation data
See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
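The dataset can be inspected locally with the `datasets` library; the snippet below is a minimal sketch.

```python
from datasets import load_dataset

# Load the German question-generation dataset used for fine-tuning and evaluation.
dataset = load_dataset("lmqg/qg_dequad")

print(dataset)              # available splits and their sizes
print(dataset["train"][0])  # one raw example from the training split
```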
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch mirroring them follows the list):
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 7
- gradient_accumulation_steps: 64
- total_train_batch_size: 128
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
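The training script itself is not part of this card; the sketch below merely mirrors the hyperparameters listed above as a `Seq2SeqTrainingArguments` object and should be read as an approximation, not the exact configuration that produced this checkpoint.

```python
# Approximate reconstruction of the listed hyperparameters; the actual training
# script is not included in this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="german-jeopardy-longt5-large-128",  # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=64,  # 2 * 64 = effective train batch size of 128
    num_train_epochs=20,
    lr_scheduler_type="constant",
    optim="adafactor",
    seed=7,
)
```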
### Training results
| Training Loss | Epoch | Step | Validation Loss | 1-gram Count | 2-gram Count | 3-gram Count | 4-gram Count | 1-gram Total | 2-gram Total | 3-gram Total | 4-gram Total | 1-gram Precision | 2-gram Precision | 3-gram Precision | 4-gram Precision | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
| 7.5882 | 0.99 | 72 | 5.6823 | 3993 | 105 | 0 | 0 | 14790 | 12586 | 10382 | 8178 | 26.998 | 0.8343 | 0.0048 | 0.0031 | 0.6461 | 14790 | 21250 | 0.1101 | 0.0077 | 0.1078 | 0.1076 | 0.0 | 0.0872 | 9.7105 | 0.1155 |
| 5.2903 | 1.99 | 145 | 4.8721 | 3827 | 229 | 32 | 0 | 18894 | 16690 | 14486 | 12282 | 20.2551 | 1.3721 | 0.2209 | 0.0041 | 0.8828 | 18894 | 21250 | 0.0924 | 0.015 | 0.091 | 0.0909 | 0.0 | 0.351 | 16.7005 | 0.0964 |
| 4.6636 | 3.0 | 218 | 4.2806 | 3638 | 174 | 21 | 0 | 15268 | 13064 | 10860 | 8656 | 23.8276 | 1.3319 | 0.1934 | 0.0058 | 0.6758 | 15268 | 21250 | 0.0884 | 0.012 | 0.0876 | 0.0874 | 0.0 | 0.2933 | 8.9197 | 0.0925 |
| 4.2229 | 4.0 | 291 | 3.9210 | 4274 | 240 | 24 | 0 | 29308 | 27104 | 24900 | 22696 | 14.583 | 0.8855 | 0.0964 | 0.0022 | 1.0 | 29308 | 21250 | 0.0894 | 0.0109 | 0.0849 | 0.0849 | 0.0 | 0.2288 | 24.7015 | 0.1023 |
| 3.9434 | 4.99 | 363 | 3.6907 | 3652 | 218 | 35 | 1 | 16442 | 14238 | 12034 | 9830 | 22.2114 | 1.5311 | 0.2908 | 0.0102 | 0.7465 | 16442 | 21250 | 0.0856 | 0.0141 | 0.0843 | 0.0842 | 0.0 | 0.4204 | 12.3049 | 0.0898 |
| 3.6152 | 5.99 | 436 | 3.4603 | 4103 | 341 | 77 | 11 | 20581 | 18377 | 16173 | 13969 | 19.9359 | 1.8556 | 0.4761 | 0.0787 | 0.968 | 20581 | 21250 | 0.107 | 0.019 | 0.1023 | 0.1024 | 0.0 | 1.0505 | 14.3607 | 0.112 |
| 3.3814 | 7.0 | 509 | 3.2883 | 4342 | 675 | 218 | 43 | 17763 | 15559 | 13355 | 11151 | 24.4441 | 4.3383 | 1.6323 | 0.3856 | 0.8218 | 17763 | 21250 | 0.1264 | 0.0353 | 0.1234 | 0.1234 | 0.0005 | 2.3489 | 10.2418 | 0.1308 |
| 3.1711 | 8.0 | 582 | 3.0988 | 4820 | 856 | 246 | 44 | 19759 | 17555 | 15351 | 13147 | 24.3939 | 4.8761 | 1.6025 | 0.3347 | 0.9273 | 19759 | 21250 | 0.1503 | 0.0465 | 0.1455 | 0.1457 | 0.0005 | 2.6207 | 14.3249 | 0.1547 |
| 3.0147 | 8.99 | 654 | 2.9540 | 5167 | 1066 | 321 | 76 | 18725 | 16521 | 14317 | 12113 | 27.5941 | 6.4524 | 2.2421 | 0.6274 | 0.8739 | 18725 | 21250 | 0.1773 | 0.0588 | 0.1721 | 0.1721 | 0.0018 | 3.4764 | 14.3067 | 0.1816 |
| 2.7829 | 9.99 | 727 | 2.8288 | 5625 | 1267 | 420 | 124 | 17327 | 15123 | 12919 | 10715 | 32.4638 | 8.378 | 3.251 | 1.1573 | 0.7974 | 17327 | 21250 | 0.2127 | 0.0741 | 0.2067 | 0.2065 | 0.0045 | 4.5099 | 12.9741 | 0.2159 |
| 2.6093 | 10.99 | 800 | 2.7177 | 6005 | 1469 | 528 | 181 | 18625 | 16421 | 14217 | 12013 | 32.2416 | 8.9459 | 3.7139 | 1.5067 | 0.8685 | 18625 | 21250 | 0.229 | 0.0827 | 0.2215 | 0.2213 | 0.0064 | 5.5051 | 14.4791 | 0.231 |
| 2.453 | 12.0 | 873 | 2.5914 | 6396 | 1744 | 664 | 246 | 18307 | 16103 | 13899 | 11695 | 34.9375 | 10.8303 | 4.7773 | 2.1035 | 0.8515 | 18307 | 21250 | 0.2553 | 0.0998 | 0.2479 | 0.2478 | 0.0059 | 6.6865 | 13.7142 | 0.2565 |
| 2.3329 | 12.99 | 945 | 2.4993 | 6673 | 1888 | 741 | 291 | 18451 | 16247 | 14043 | 11839 | 36.1661 | 11.6206 | 5.2767 | 2.458 | 0.8592 | 18451 | 21250 | 0.2747 | 0.1114 | 0.2652 | 0.2652 | 0.0091 | 7.383 | 14.1751 | 0.2749 |
| 2.1663 | 13.99 | 1018 | 2.4196 | 6953 | 2052 | 834 | 337 | 18531 | 16327 | 14123 | 11919 | 37.5209 | 12.5681 | 5.9053 | 2.8274 | 0.8635 | 18531 | 21250 | 0.2886 | 0.1215 | 0.2773 | 0.277 | 0.0082 | 8.1343 | 14.6783 | 0.2889 |
| 2.0422 | 14.99 | 1091 | 2.3703 | 6968 | 2089 | 862 | 365 | 17984 | 15780 | 13576 | 11372 | 38.7456 | 13.2383 | 6.3494 | 3.2096 | 0.8339 | 17984 | 21250 | 0.2961 | 0.1268 | 0.2858 | 0.2857 | 0.0113 | 8.4322 | 13.6987 | 0.2951 |
| 1.9245 | 16.0 | 1164 | 2.3217 | 7500 | 2353 | 999 | 446 | 19017 | 16813 | 14609 | 12405 | 39.4384 | 13.9951 | 6.8383 | 3.5953 | 0.8892 | 19017 | 21250 | 0.3149 | 0.1407 | 0.3017 | 0.3017 | 0.0132 | 9.5973 | 14.77 | 0.314 |
| 1.8216 | 17.0 | 1237 | 2.2705 | 7444 | 2357 | 1044 | 488 | 18219 | 16015 | 13811 | 11607 | 40.8584 | 14.7175 | 7.5592 | 4.2044 | 0.8467 | 18219 | 21250 | 0.3201 | 0.1437 | 0.3081 | 0.3077 | 0.0132 | 9.9557 | 13.8031 | 0.3181 |
| 1.7503 | 17.99 | 1309 | 2.2386 | 7571 | 2487 | 1114 | 515 | 18275 | 16071 | 13867 | 11663 | 41.4282 | 15.4751 | 8.0335 | 4.4157 | 0.8498 | 18275 | 21250 | 0.3289 | 0.1512 | 0.3153 | 0.3151 | 0.0145 | 10.4354 | 13.9106 | 0.3265 |
| 1.6342 | 18.99 | 1382 | 2.2183 | 7697 | 2536 | 1155 | 537 | 18129 | 15925 | 13721 | 11517 | 42.4568 | 15.9246 | 8.4178 | 4.6627 | 0.8418 | 18129 | 21250 | 0.3342 | 0.1559 | 0.3224 | 0.3222 | 0.0177 | 10.7447 | 13.8494 | 0.3313 |
| 1.5474 | 19.79 | 1440 | 2.1956 | 7879 | 2632 | 1187 | 570 | 18815 | 16611 | 14407 | 12203 | 41.8762 | 15.8449 | 8.2391 | 4.671 | 0.8786 | 18815 | 21250 | 0.3398 | 0.1607 | 0.326 | 0.326 | 0.0177 | 11.1066 | 14.5136 | 0.3375 |
### Framework versions
- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3