	---
	base_model: hiba2/results_arat5-2_wiki
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: results_arat5-3_wiki
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# results_arat5-3_wiki

	This model is a fine-tuned version of [hiba2/results_arat5-2_wiki](https://huggingface.co/hiba2/results_arat5-2_wiki) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.7821
	- Rouge1: 0.0926
	- Rouge2: 0.0015
	- Rougel: 0.0934
	- Rougelsum: 0.0928
	- Gen Len: 19.0
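
For readers unfamiliar with the ROUGE scores listed above, the following is a minimal sketch of how a ROUGE-1 F1 score is formed from unigram overlap. It assumes plain whitespace tokenization, which is a simplification: the scores in this card come from the Trainer's `rouge` metric, whose tokenization and stemming settings are not stated here, so this function will not reproduce the reported numbers. The name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a prediction and a reference.

    Assumption: tokens are split on whitespace only (no stemming,
    no normalization), unlike full ROUGE implementations.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat", "the cat ran away")
print(round(score, 4))
```

Rouge2 applies the same idea to bigrams, and Rougel uses the longest common subsequence instead of n-gram counts.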

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20
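
As an illustration of what `lr_scheduler_type: linear` implies for the learning rate over the run, here is a small sketch of a linear schedule with optional warmup. Two values are assumptions not listed in this card: zero warmup steps, and a total of 10520 optimizer steps (526 steps per epoch over 20 epochs, inferred from the step and epoch columns of the training results). The function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 5e-5, warmup_steps: int = 0) -> float:
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 526 steps/epoch * 20 epochs; not stated explicitly in the card.
TOTAL_STEPS = 10520

for step in (0, 5260, 10500):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 5e-05, is halved at the midpoint of training, and is close to zero by the last logged step.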

	### Training results

	| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
	|:-------------:|:-------:|:-----:|:-------:|:---------------:|:------:|:------:|:------:|:---------:|
	| 7.8936 | 0.9506 | 500 | 0.0 | 5.9107 | 0.0 | 0.0 | 0.0 | 0.0 |
	| 5.5649 | 1.9011 | 1000 | 18.8876 | 4.9336 | 0.0905 | 0.0 | 0.0915 | 0.0912 |
	| 4.9098 | 2.8517 | 1500 | 18.8989 | 4.3731 | 0.0905 | 0.0 | 0.0915 | 0.0912 |
	| 4.4486 | 3.8023 | 2000 | 19.0 | 3.9340 | 0.0875 | 0.0 | 0.0885 | 0.0882 |
	| 4.0755 | 4.7529 | 2500 | 18.9382 | 3.5412 | 0.0881 | 0.0 | 0.0891 | 0.0887 |
	| 3.6998 | 5.7034 | 3000 | 18.8783 | 3.1344 | 0.095 | 0.0002 | 0.0958 | 0.0954 |
	| 3.3129 | 6.6540 | 3500 | 18.8408 | 2.8528 | 0.0935 | 0.0013 | 0.0945 | 0.094 |
	| 3.1053 | 7.6046 | 4000 | 18.9382 | 2.6196 | 0.0936 | 0.0008 | 0.0946 | 0.0941 |
	| 2.8412 | 8.5551 | 4500 | 18.867 | 2.4414 | 0.091 | 0.0011 | 0.0919 | 0.0915 |
	| 2.702 | 9.5057 | 5000 | 18.8783 | 2.2952 | 0.0936 | 0.001 | 0.0948 | 0.0946 |
	| 2.5611 | 10.4563 | 5500 | 19.0 | 2.1816 | 0.093 | 0.0011 | 0.0941 | 0.0936 |
	| 2.4499 | 11.4068 | 6000 | 18.8502 | 2.0914 | 0.0988 | 0.0011 | 0.0995 | 0.099 |
	| 2.3764 | 12.3574 | 6500 | 18.8371 | 2.0264 | 0.0992 | 0.0016 | 0.0997 | 0.0995 |
	| 2.3172 | 13.3080 | 7000 | 18.9888 | 1.9853 | 0.098 | 0.0015 | 0.099 | 0.0986 |
	| 2.2794 | 14.2586 | 7500 | 18.9888 | 1.9615 | 0.0971 | 0.0023 | 0.0977 | 0.0976 |
	| 2.2178 | 15.2091 | 8000 | 19.0 | 1.9424 | 0.0961 | 0.0009 | 0.0972 | 0.0968 |
	| 2.2378 | 16.1597 | 8500 | 19.0 | 1.8855 | 0.0935 | 0.0011 | 0.0942 | 0.0937 |
	| 2.1573 | 17.1103 | 9000 | 19.0 | 1.8386 | 0.0952 | 0.0009 | 0.0962 | 0.0958 |
	| 2.132 | 18.0608 | 9500 | 18.8783 | 1.8055 | 0.0919 | 0.0012 | 0.0929 | 0.0923 |
	| 2.1035 | 19.0114 | 10000 | 19.0 | 1.7863 | 0.0942 | 0.0015 | 0.0949 | 0.0945 |
	| 2.0818 | 19.9620 | 10500 | 19.0 | 1.7821 | 0.0926 | 0.0015 | 0.0934 | 0.0928 |
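
The step and epoch columns of the table make it possible to estimate how many optimizer steps made up one epoch. The sketch below does that arithmetic on three rows and derives a rough range for the training-set size. That range rests on assumptions the card does not confirm: a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped.

```python
# (step, epoch) pairs copied from the training results table.
rows = [(500, 0.9506), (1000, 1.9011), (10500, 19.9620)]

# Each row gives an independent estimate of steps per epoch.
estimates = [step / epoch for step, epoch in rows]
steps_per_epoch = round(sum(estimates) / len(estimates))

train_batch_size = 4  # from the hyperparameters listed in this card

# Assumption: one optimizer step per batch of 4 examples, last batch may be partial.
max_examples = steps_per_epoch * train_batch_size
min_examples = (steps_per_epoch - 1) * train_batch_size + 1

print(steps_per_epoch, min_examples, max_examples)
```

If those assumptions hold, the training split would contain roughly two thousand examples, which may help when interpreting the loss curve and the flat ROUGE scores.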


	### Framework versions

	- Transformers 4.42.0.dev0
	- Pytorch 2.3.0+cu121
	- Datasets 2.19.1
	- Tokenizers 0.19.1