ptt5-wikilingua-30epochs / README.md

arthurmluz

	---
	license: mit
	base_model: unicamp-dl/ptt5-base-portuguese-vocab
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: ptt5-wikilingua-30epochs
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# ptt5-wikilingua-30epochs

	This model is a fine-tuned version of [unicamp-dl/ptt5-base-portuguese-vocab](https://huggingface.co/unicamp-dl/ptt5-base-portuguese-vocab) on a dataset that the Trainer did not record. The model name suggests WikiLingua, but the card does not confirm this.
	It achieves the following results on the evaluation set:
	- Loss: 1.9063
	- Rouge1: 0.2604
	- Rouge2: 0.1127
	- Rougel: 0.2222
	- Rougelsum: 0.2541
	- Gen Len: 18.4528
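The card does not document how to run the model, so the following is only a minimal sketch of inference with the standard `transformers` seq2seq classes. The repository id `arthurmluz/ptt5-wikilingua-30epochs` is an assumption pieced together from the page header, and the generation settings (`max_new_tokens`, `num_beams`, the 512-token input limit) and the absence of a task prefix are illustrative choices, not values recorded by the Trainer.

```python
# Assumed repository id (author + model name from the page header); adjust if it differs.
MODEL_ID = "arthurmluz/ptt5-wikilingua-30epochs"


def summarize(text, model_id=MODEL_ID, max_new_tokens=32, num_beams=4):
    """Generate a summary for `text`. All generation settings are illustrative."""
    # Imported lazily so this module loads even where transformers is not installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # No task prefix is added: the card does not say whether one was used in training.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=num_beams
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...")` with a Portuguese passage downloads the checkpoint on first use. Output quality should be judged against the ROUGE figures above rather than assumed.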

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 30
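As a rough cross-check of these settings, the sketch below derives the total step count, the approximate training-set size, and the learning rate at a given step. It assumes a single device, no gradient accumulation, and no warmup, none of which the card states explicitly. The steps-per-epoch figure (28580) is taken from the first row of the training results table, and `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule rather than the Trainer's own code.

```python
# Values copied from the hyperparameter list and the training results table.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 2
NUM_EPOCHS = 30
STEPS_PER_EPOCH = 28580  # step count logged at epoch 1.0

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def approx_train_examples(steps_per_epoch=STEPS_PER_EPOCH, batch_size=TRAIN_BATCH_SIZE):
    """Approximate training-set size (assumes one device, no gradient accumulation)."""
    return steps_per_epoch * batch_size


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero (warmup_steps=0 is an assumption)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(TOTAL_STEPS)                  # matches the final step in the results table
    print(approx_train_examples())      # rough number of training examples
    print(linear_lr(TOTAL_STEPS // 2))  # learning rate halfway through training
```

Under these assumptions the total comes to 857400 steps, the same as the last step logged in the results table, and the training set holds roughly 57,160 examples.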

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
	|:-------------:|:-----:|:------:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
	| 2.1668 | 1.0 | 28580 | 2.0384 | 0.2366 | 0.0935 | 0.2034 | 0.2311 | 18.2195 |
	| 2.0348 | 2.0 | 57160 | 1.9725 | 0.2448 | 0.0998 | 0.2098 | 0.2391 | 18.3898 |
	| 2.0152 | 3.0 | 85740 | 1.9346 | 0.2469 | 0.1024 | 0.2122 | 0.2414 | 18.2427 |
	| 1.9769 | 4.0 | 114320 | 1.9096 | 0.2503 | 0.1047 | 0.2147 | 0.2446 | 18.2773 |
	| 1.8471 | 5.0 | 142900 | 1.8957 | 0.253 | 0.1076 | 0.2169 | 0.2473 | 18.2612 |
	| 1.8504 | 6.0 | 171480 | 1.8840 | 0.2541 | 0.1084 | 0.2179 | 0.2483 | 18.3317 |
	| 1.7456 | 7.0 | 200060 | 1.8768 | 0.2547 | 0.1084 | 0.2183 | 0.2488 | 18.3634 |
	| 1.7254 | 8.0 | 228640 | 1.8747 | 0.2563 | 0.1099 | 0.2196 | 0.2505 | 18.3577 |
	| 1.7742 | 9.0 | 257220 | 1.8739 | 0.2562 | 0.11 | 0.2194 | 0.2504 | 18.3904 |
	| 1.7211 | 10.0 | 285800 | 1.8667 | 0.2572 | 0.1109 | 0.2205 | 0.2513 | 18.3616 |
	| 1.696 | 11.0 | 314380 | 1.8677 | 0.2568 | 0.1112 | 0.2204 | 0.251 | 18.349 |
	| 1.6762 | 12.0 | 342960 | 1.8695 | 0.2571 | 0.1108 | 0.2202 | 0.2513 | 18.3528 |
	| 1.6404 | 13.0 | 371540 | 1.8738 | 0.2582 | 0.1115 | 0.2208 | 0.2523 | 18.3909 |
	| 1.6523 | 14.0 | 400120 | 1.8727 | 0.259 | 0.1118 | 0.2215 | 0.253 | 18.4077 |
	| 1.626 | 15.0 | 428700 | 1.8736 | 0.2596 | 0.1124 | 0.2222 | 0.2537 | 18.4245 |
	| 1.5922 | 16.0 | 457280 | 1.8750 | 0.259 | 0.1123 | 0.2215 | 0.253 | 18.4125 |
	| 1.5345 | 17.0 | 485860 | 1.8783 | 0.2591 | 0.112 | 0.2214 | 0.2529 | 18.4013 |
	| 1.5785 | 18.0 | 514440 | 1.8797 | 0.2588 | 0.112 | 0.2212 | 0.2527 | 18.3965 |
	| 1.5097 | 19.0 | 543020 | 1.8868 | 0.2592 | 0.1115 | 0.221 | 0.2531 | 18.4567 |
	| 1.5091 | 20.0 | 571600 | 1.8851 | 0.2593 | 0.1124 | 0.2216 | 0.2533 | 18.397 |
	| 1.5116 | 21.0 | 600180 | 1.8895 | 0.2599 | 0.1124 | 0.2219 | 0.2537 | 18.4505 |
	| 1.5351 | 22.0 | 628760 | 1.8901 | 0.2606 | 0.113 | 0.2225 | 0.2544 | 18.4369 |
	| 1.5125 | 23.0 | 657340 | 1.8953 | 0.2598 | 0.1125 | 0.2218 | 0.2535 | 18.4273 |
	| 1.5246 | 24.0 | 685920 | 1.8980 | 0.2609 | 0.1129 | 0.2226 | 0.2544 | 18.4464 |
	| 1.5113 | 25.0 | 714500 | 1.8990 | 0.2604 | 0.1127 | 0.2221 | 0.2542 | 18.4562 |
	| 1.4814 | 26.0 | 743080 | 1.9029 | 0.261 | 0.1133 | 0.223 | 0.2547 | 18.4634 |
	| 1.5212 | 27.0 | 771660 | 1.9014 | 0.2606 | 0.1129 | 0.2226 | 0.2544 | 18.4458 |
	| 1.4469 | 28.0 | 800240 | 1.9032 | 0.2609 | 0.1129 | 0.2226 | 0.2546 | 18.4577 |
	| 1.4844 | 29.0 | 828820 | 1.9050 | 0.2602 | 0.1125 | 0.2221 | 0.2539 | 18.4553 |
	| 1.4561 | 30.0 | 857400 | 1.9063 | 0.2604 | 0.1127 | 0.2222 | 0.2541 | 18.4528 |
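The table above can be read for checkpoint selection: validation loss bottoms out well before the final epoch, while Rouge1 keeps creeping up for longer. The sketch below copies the epoch, validation loss, and Rouge1 columns and picks the best epoch under each criterion. Which criterion the author used to choose the published weights is not stated; the reported evaluation numbers match the epoch 30 row.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (1, 2.0384, 0.2366), (2, 1.9725, 0.2448), (3, 1.9346, 0.2469),
    (4, 1.9096, 0.2503), (5, 1.8957, 0.2530), (6, 1.8840, 0.2541),
    (7, 1.8768, 0.2547), (8, 1.8747, 0.2563), (9, 1.8739, 0.2562),
    (10, 1.8667, 0.2572), (11, 1.8677, 0.2568), (12, 1.8695, 0.2571),
    (13, 1.8738, 0.2582), (14, 1.8727, 0.2590), (15, 1.8736, 0.2596),
    (16, 1.8750, 0.2590), (17, 1.8783, 0.2591), (18, 1.8797, 0.2588),
    (19, 1.8868, 0.2592), (20, 1.8851, 0.2593), (21, 1.8895, 0.2599),
    (22, 1.8901, 0.2606), (23, 1.8953, 0.2598), (24, 1.8980, 0.2609),
    (25, 1.8990, 0.2604), (26, 1.9029, 0.2610), (27, 1.9014, 0.2606),
    (28, 1.9032, 0.2609), (29, 1.9050, 0.2602), (30, 1.9063, 0.2604),
]

# Lowest validation loss and highest Rouge1 pick different epochs.
best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge1 = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("best validation loss:", best_by_loss)
    print("best Rouge1:", best_by_rouge1)
```

Validation loss is lowest at epoch 10 (1.8667) and Rouge1 is highest at epoch 26 (0.261). Training loss continues to fall throughout, so the later epochs trade a slightly higher validation loss for marginal ROUGE gains.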


	### Framework versions

	- Transformers 4.33.2
	- PyTorch 2.0.1+cu117
	- Datasets 2.14.5
	- Tokenizers 0.13.3
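To check whether a local environment matches these versions, a small comparison against installed package metadata can help. This is a sketch, not part of the original training setup: `version_mismatches` is a hypothetical helper, and the distribution names (`torch` for PyTorch, and so on) are the usual PyPI names.

```python
from importlib import metadata

# Versions listed in the card; "torch" is the PyPI distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.33.2",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.5",
    "tokenizers": "0.13.3",
}


def version_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in version_mismatches().items():
        print(f"{package}: card lists {wanted}, found {installed}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one recorded here.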