text_shortening_model_v2

This model is a fine-tuned version of t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

Loss: 1.4449
Rouge1: 0.581
Rouge2: 0.3578
Rougel: 0.5324
Rougelsum: 0.5317
Bert precision: 0.8885
Bert recall: 0.8981
Average word count: 11.5929
Max word count: 17
Min word count: 3
Average token count: 16.7071
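
As a rough cross-check on the figures above, the sketch below derives two numbers the card does not report: an approximate BERTScore F1 and the ratio of average tokens to average words. The F1 here is the harmonic mean of the averaged precision and recall, which only approximates the F1 that would be computed per example and then averaged. Both are illustrative derivations, not evaluation outputs.

```python
# Values copied from the evaluation results above.
bert_precision = 0.8885
bert_recall = 0.8981
avg_word_count = 11.5929
avg_token_count = 16.7071

# Harmonic mean of the averaged scores. This approximates, but is not equal to,
# an F1 computed per example and then averaged.
approx_bert_f1 = 2 * bert_precision * bert_recall / (bert_precision + bert_recall)

# Ratio of the two averages: roughly how many tokens one output word costs.
tokens_per_word = avg_token_count / avg_word_count

print(f"approx. BERTScore F1: {approx_bert_f1:.4f}")
print(f"tokens per word:      {tokens_per_word:.2f}")
```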

Model description

The model is fine-tuned and used without the "summarize" task prefix; input text is passed as-is.
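
A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub as `ldos/text_shortening_model_v2` (taken from the page title, not confirmed by the card) and loads with the standard `transformers` seq2seq classes. The helpers `build_input` and `shorten` and the value `max_new_tokens=32` are illustrative choices, not part of the model's documentation.

```python
def build_input(text: str) -> str:
    """Return plain input text, dropping a leading 'summarize:' prefix if present."""
    text = text.strip()
    prefix = "summarize:"
    if text.lower().startswith(prefix):
        text = text[len(prefix):].lstrip()
    return text


def shorten(text: str,
            model_id: str = "ldos/text_shortening_model_v2",  # assumed repository id
            max_new_tokens: int = 32) -> str:                 # assumed generation budget
    """Generate a shortened version of `text` (downloads the model on first use)."""
    # Imported lazily so build_input can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_input(text), return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `shorten("Some long headline to compress")` would return the model's shortened text; the output depends on the checkpoint and is not shown here.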

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 64
eval_batch_size: 64
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
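
The step counts in the training results table follow from these settings: 8 optimizer steps per epoch over 20 epochs gives 160 steps in total. The sketch below reproduces that arithmetic and a linear decay schedule. It assumes zero warmup steps, a single device, and no gradient accumulation, none of which the card states; `linear_lr` is a hypothetical helper, not a library function.

```python
LEARNING_RATE = 1e-4   # learning_rate
TRAIN_BATCH_SIZE = 64  # train_batch_size
NUM_EPOCHS = 20        # num_epochs
STEPS_PER_EPOCH = 8    # read off the results table (step 8 at epoch 1.0)

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

# Eight steps of at most 64 examples each bound the training set size
# (assumes one device, no gradient accumulation, last partial batch kept).
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              total: int = total_steps, warmup: int = 0) -> float:
    """Linear warmup followed by linear decay to zero (warmup=0 is an assumption)."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print("total steps:", total_steps)
print("training examples between", min_examples, "and", max_examples)
for step in (0, 80, 160):
    print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```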

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Bert precision	Bert recall	Average word count	Max word count	Min word count	Average token count
1.7498	1.0	8	1.9424	0.4725	0.2644	0.4207	0.4216	0.8343	0.8502	11.7357	18	0	17.5143
1.5236	2.0	16	1.7731	0.5185	0.2961	0.4661	0.4665	0.8566	0.8646	11.05	18	0	16.6143
1.4381	3.0	24	1.6880	0.5459	0.3212	0.4947	0.4942	0.8773	0.8862	11.5857	18	3	16.8143
1.3895	4.0	32	1.6405	0.5537	0.3275	0.506	0.5061	0.8815	0.8894	11.7	18	3	16.6571
1.353	5.0	40	1.5941	0.5579	0.3347	0.5124	0.5119	0.8839	0.8933	11.7643	18	4	16.7429
1.3026	6.0	48	1.5568	0.5585	0.3379	0.5132	0.5129	0.8823	0.8945	11.9714	18	4	16.95
1.2624	7.0	56	1.5359	0.5696	0.3466	0.5202	0.5195	0.8837	0.897	12.0143	18	5	17.1143
1.2481	8.0	64	1.5186	0.5736	0.3517	0.5241	0.523	0.8849	0.898	12.0214	17	6	17.1714
1.2089	9.0	72	1.5055	0.5732	0.3499	0.5256	0.5246	0.8846	0.8979	12.0357	17	5	17.2214
1.1845	10.0	80	1.4898	0.5761	0.3548	0.5284	0.5276	0.886	0.8977	11.9	17	5	17.0786
1.1882	11.0	88	1.4787	0.5768	0.3573	0.5291	0.5288	0.8862	0.8986	11.8071	17	5	17.05
1.1649	12.0	96	1.4720	0.5784	0.3592	0.5319	0.531	0.8868	0.8988	11.7786	17	5	17.0
1.1643	13.0	104	1.4637	0.5785	0.3592	0.5314	0.5308	0.8875	0.8977	11.6571	17	3	16.8214
1.129	14.0	112	1.4565	0.5794	0.3585	0.5324	0.5315	0.8883	0.8984	11.6571	17	3	16.8
1.136	15.0	120	1.4516	0.5826	0.3598	0.537	0.5363	0.8898	0.8995	11.5857	17	3	16.6786
1.1191	16.0	128	1.4491	0.5828	0.3579	0.5357	0.535	0.8895	0.899	11.5929	17	3	16.6857
1.1192	17.0	136	1.4471	0.5794	0.355	0.5312	0.5307	0.8883	0.898	11.6143	17	3	16.7286
1.1085	18.0	144	1.4456	0.5808	0.3557	0.5315	0.5307	0.8883	0.8982	11.6286	17	3	16.7429
1.1063	19.0	152	1.4451	0.5808	0.3571	0.5321	0.5314	0.8884	0.8981	11.6	17	3	16.7143
1.0965	20.0	160	1.4449	0.581	0.3578	0.5324	0.5317	0.8885	0.8981	11.5929	17	3	16.7071
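
The table can be scanned programmatically to see which epoch is best under each metric. The sketch below copies three columns (epoch, validation loss, Rouge1) from the table and picks the epoch with the lowest validation loss and the one with the highest Rouge1; the two need not coincide.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
rows = [
    (1, 1.9424, 0.4725), (2, 1.7731, 0.5185), (3, 1.6880, 0.5459), (4, 1.6405, 0.5537),
    (5, 1.5941, 0.5579), (6, 1.5568, 0.5585), (7, 1.5359, 0.5696), (8, 1.5186, 0.5736),
    (9, 1.5055, 0.5732), (10, 1.4898, 0.5761), (11, 1.4787, 0.5768), (12, 1.4720, 0.5784),
    (13, 1.4637, 0.5785), (14, 1.4565, 0.5794), (15, 1.4516, 0.5826), (16, 1.4491, 0.5828),
    (17, 1.4471, 0.5794), (18, 1.4456, 0.5808), (19, 1.4451, 0.5808), (20, 1.4449, 0.5810),
]

# Lowest validation loss and highest Rouge1 can fall on different epochs.
best_loss = min(rows, key=lambda r: r[1])
best_rouge1 = max(rows, key=lambda r: r[2])

print(f"lowest validation loss: epoch {best_loss[0]} ({best_loss[1]})")
print(f"highest Rouge1:         epoch {best_rouge1[0]} ({best_rouge1[2]})")
```

In this table, validation loss is lowest at epoch 20, while Rouge1 peaks at epoch 16 (0.5828, against 0.581 at epoch 20), so the reported final checkpoint is the best by loss but not by Rouge1.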

Framework versions

Transformers 4.32.1
Pytorch 2.0.1+cu118
Datasets 2.14.4
Tokenizers 0.13.3
