---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-mt5-base
results:
- task:
name: Sequence-to-sequence Language Modeling
type: text2text-generation
dataset:
name: lmqg/qg_dequad
type: default
args: default
metrics:
- name: BLEU-4
type: bleu4
value: 14.56
- name: F1
type: f1
value: 39.53
- name: ROUGE-1
type: rouge1
value: 40.62
- name: ROUGE-2
type: rouge2
value: 21.49
- name: ROUGE-L
type: rougel
value: 39.14
- name: ROUGE-Lsum
type: rougelsum
value: 39.13
- name: Exact Match
type: exact_match
value: 2.72
---

# german-jeopardy-mt5-base
This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset. It achieves the following results on the evaluation set:
- Loss: 1.66
- Brevity Penalty: 0.9025
- System Length: 18860
- Reference Length: 20793
- ROUGE-1: 40.62
- ROUGE-2: 21.49
- ROUGE-L: 39.14
- ROUGE-Lsum: 39.13
- Exact Match: 2.72
- BLEU: 14.56
- F1: 39.53
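The reported brevity penalty is determined by the system and reference lengths above. A minimal check using the standard BLEU brevity-penalty formula (BP = exp(1 − ref/sys) when the system output is shorter than the reference) reproduces the reported value up to rounding:

```python
import math

def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty: 1.0 if the candidate is at least
    as long as the reference, exp(1 - ref/sys) otherwise."""
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)

# Lengths reported on the evaluation set above.
print(round(brevity_penalty(18860, 20793), 4))
```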
## Model description

See [google/mt5-base](https://huggingface.co/google/mt5-base) for the model architecture. The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.
## Intended uses & limitations

This model can be used for question generation on German text.
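A minimal usage sketch with the Transformers library. The `<hl>` answer-highlighting input format is an assumption borrowed from common lmqg question-generation conventions, and the hub id passed to `generate_question` is a placeholder; verify both against the preprocessing actually used for this checkpoint:

```python
def format_input(context: str, answer: str) -> str:
    """Mark the answer span with <hl> tokens (assumed input convention;
    verify against the preprocessing used for this checkpoint)."""
    return context.replace(answer, f"<hl> {answer} <hl>", 1)

def generate_question(context: str, answer: str, model_id: str) -> str:
    """Load the checkpoint and generate one question.

    Note: this downloads the model, so it is defined but not called here.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    input_ids = tokenizer(format_input(context, answer),
                          return_tensors="pt").input_ids
    output = model.generate(input_ids, max_length=64)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Example call (with the real hub id substituted): `generate_question("Der Rhein ist ein Fluss in Deutschland.", "Rhein", "<hub-id>/german-jeopardy-mt5-base")`.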
## Training and evaluation data

See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 7
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
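The `total_train_batch_size` above is the product of the per-device batch size and the gradient-accumulation steps, which is also consistent with the roughly 145 optimizer steps per epoch shown in the results table:

```python
# Effective batch size implied by the hyperparameters above.
train_batch_size = 4
gradient_accumulation_steps = 16
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 64, matching total_train_batch_size above

# ~145 optimizer steps per epoch at this effective batch size suggests
# a training set of roughly 145 * 64 ≈ 9,300 examples.
approx_train_examples = 145 * total_train_batch_size
```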
### Training results
Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5.5131 | 1.0 | 145 | 1.8698 | 6032 | 1668 | 626 | 216 | 16023 | 13819 | 11615 | 9411 | 37.6459 | 12.0703 | 5.3896 | 2.2952 | 0.7216 | 16023 | 21250 | 0.2485 | 0.1011 | 0.2368 | 0.2366 | 0.0018 | 6.2485 | 12.6166 | 0.2406 |
2.3946 | 2.0 | 291 | 1.5888 | 7325 | 2554 | 1178 | 558 | 16853 | 14649 | 12445 | 10241 | 43.4641 | 17.4346 | 9.4656 | 5.4487 | 0.7704 | 16853 | 21250 | 0.3226 | 0.1585 | 0.31 | 0.31 | 0.0145 | 10.8315 | 12.2582 | 0.3148 |
2.0101 | 3.0 | 436 | 1.4997 | 7623 | 2764 | 1304 | 629 | 17042 | 14838 | 12634 | 10430 | 44.7307 | 18.6278 | 10.3214 | 6.0307 | 0.7812 | 17042 | 21250 | 0.3403 | 0.1723 | 0.3263 | 0.3263 | 0.0154 | 11.7891 | 12.6783 | 0.3315 |
1.8073 | 4.0 | 582 | 1.4610 | 7728 | 2916 | 1415 | 707 | 16654 | 14450 | 12246 | 10042 | 46.4033 | 20.1799 | 11.5548 | 7.0404 | 0.7588 | 16654 | 21250 | 0.3461 | 0.1818 | 0.3324 | 0.3326 | 0.0168 | 12.6068 | 12.2963 | 0.3387 |
1.6851 | 4.99 | 727 | 1.4357 | 7964 | 3059 | 1483 | 727 | 17381 | 15177 | 12973 | 10769 | 45.8201 | 20.1555 | 11.4314 | 6.7509 | 0.8004 | 17381 | 21250 | 0.3558 | 0.1888 | 0.3415 | 0.3414 | 0.0159 | 13.0784 | 12.7436 | 0.3483 |
1.5642 | 6.0 | 873 | 1.4003 | 8299 | 3224 | 1592 | 788 | 17351 | 15147 | 12943 | 10739 | 47.8301 | 21.2847 | 12.3001 | 7.3377 | 0.7987 | 17351 | 21250 | 0.3814 | 0.2025 | 0.3684 | 0.3685 | 0.0204 | 13.9065 | 12.9569 | 0.3736 |
1.4756 | 6.99 | 1018 | 1.3779 | 8640 | 3430 | 1712 | 879 | 17669 | 15465 | 13261 | 11057 | 48.8992 | 22.1791 | 12.91 | 7.9497 | 0.8165 | 17669 | 21250 | 0.3971 | 0.2133 | 0.3828 | 0.3826 | 0.025 | 14.9146 | 13.1084 | 0.3892 |
1.3792 | 8.0 | 1164 | 1.3624 | 8732 | 3417 | 1712 | 871 | 17996 | 15792 | 13588 | 11384 | 48.5219 | 21.6375 | 12.5994 | 7.6511 | 0.8346 | 17996 | 21250 | 0.4003 | 0.2131 | 0.3852 | 0.3849 | 0.0245 | 14.8859 | 13.3748 | 0.3917 |
1.3133 | 9.0 | 1310 | 1.3630 | 8804 | 3500 | 1754 | 920 | 17661 | 15457 | 13253 | 11049 | 49.85 | 22.6435 | 13.2347 | 8.3265 | 0.8161 | 17661 | 21250 | 0.4078 | 0.219 | 0.3932 | 0.3935 | 0.025 | 15.3264 | 13.2019 | 0.4 |
1.261 | 10.0 | 1455 | 1.3685 | 8910 | 3602 | 1849 | 1000 | 17709 | 15505 | 13301 | 11097 | 50.3134 | 23.2312 | 13.9012 | 9.0114 | 0.8188 | 17709 | 21250 | 0.4135 | 0.223 | 0.3991 | 0.3992 | 0.0295 | 16.0163 | 13.1892 | 0.4055 |
1.1897 | 11.0 | 1601 | 1.3639 | 9096 | 3690 | 1902 | 1012 | 18261 | 16057 | 13853 | 11649 | 49.8111 | 22.9806 | 13.7299 | 8.6874 | 0.849 | 18261 | 21250 | 0.4201 | 0.2289 | 0.4059 | 0.4057 | 0.0281 | 16.3202 | 13.5077 | 0.4121 |
1.1453 | 11.99 | 1746 | 1.3610 | 9106 | 3735 | 1932 | 1023 | 18329 | 16125 | 13921 | 11717 | 49.6808 | 23.1628 | 13.8783 | 8.7309 | 0.8527 | 18329 | 21250 | 0.4173 | 0.2303 | 0.4026 | 0.4025 | 0.0281 | 16.4772 | 13.8013 | 0.4099 |
1.0858 | 13.0 | 1892 | 1.3716 | 9245 | 3778 | 1955 | 1049 | 18556 | 16352 | 14148 | 11944 | 49.8222 | 23.1042 | 13.8182 | 8.7827 | 0.8649 | 18556 | 21250 | 0.4244 | 0.2327 | 0.409 | 0.409 | 0.0322 | 16.7204 | 13.8144 | 0.417 |
1.0472 | 13.99 | 2037 | 1.3770 | 9166 | 3756 | 1946 | 1054 | 18315 | 16111 | 13907 | 11703 | 50.0464 | 23.3133 | 13.993 | 9.0062 | 0.8519 | 18315 | 21250 | 0.4216 | 0.2311 | 0.4068 | 0.4067 | 0.0309 | 16.6825 | 13.8099 | 0.4143 |
0.9953 | 15.0 | 2183 | 1.3881 | 9342 | 3926 | 2046 | 1108 | 18132 | 15928 | 13724 | 11520 | 51.5222 | 24.6484 | 14.9082 | 9.6181 | 0.842 | 18132 | 21250 | 0.4328 | 0.2418 | 0.4171 | 0.4171 | 0.0327 | 17.3937 | 13.5023 | 0.4258 |
0.9509 | 16.0 | 2329 | 1.4016 | 9330 | 3894 | 2024 | 1084 | 18672 | 16468 | 14264 | 12060 | 49.9679 | 23.6459 | 14.1896 | 8.9884 | 0.871 | 18672 | 21250 | 0.4269 | 0.237 | 0.4123 | 0.4122 | 0.0313 | 17.1618 | 13.956 | 0.4198 |
0.9183 | 17.0 | 2474 | 1.4152 | 9303 | 3824 | 1979 | 1084 | 18476 | 16272 | 14068 | 11864 | 50.3518 | 23.5005 | 14.0674 | 9.1369 | 0.8606 | 18476 | 21250 | 0.4269 | 0.2345 | 0.4121 | 0.4122 | 0.0327 | 16.995 | 13.7854 | 0.4199 |
0.8696 | 18.0 | 2620 | 1.4404 | 9184 | 3798 | 1993 | 1085 | 18379 | 16175 | 13971 | 11767 | 49.9701 | 23.4807 | 14.2653 | 9.2207 | 0.8554 | 18379 | 21250 | 0.4218 | 0.2333 | 0.4076 | 0.4074 | 0.034 | 16.9541 | 13.726 | 0.4148 |
0.8389 | 19.0 | 2765 | 1.4360 | 9476 | 4000 | 2092 | 1139 | 19003 | 16799 | 14595 | 12391 | 49.8658 | 23.8109 | 14.3337 | 9.1922 | 0.8885 | 19003 | 21250 | 0.4307 | 0.2406 | 0.4161 | 0.416 | 0.0299 | 17.67 | 14.2064 | 0.4239 |
0.7993 | 19.92 | 2900 | 1.4545 | 9464 | 3970 | 2078 | 1126 | 18741 | 16537 | 14333 | 12129 | 50.4989 | 24.0068 | 14.498 | 9.2835 | 0.8747 | 18741 | 21250 | 0.4349 | 0.2424 | 0.4194 | 0.4192 | 0.0327 | 17.5799 | 13.9959 | 0.4269 |
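The Exact Match and F1 columns are presumably SQuAD-style string-match metrics over generated versus reference questions. A sketch of how these are typically computed (the exact text normalization used during evaluation is an assumption):

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over
    whitespace tokens, as in SQuAD-style QA/QG evaluation."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("was ist die hauptstadt von deutschland",
               "wie heisst die hauptstadt von deutschland"))  # 2/3
```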
### Framework versions
- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3