Kumshe/t5-small-finetuned-english-to-hausa



	---
	library_name: transformers
	license: apache-2.0
	base_model: google-t5/t5-small
	tags:
	- translation
	- generated_from_trainer
	metrics:
	- bleu
	model-index:
	- name: t5-small-finetuned-english-to-hausa
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# t5-small-finetuned-english-to-hausa

	This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small). The training dataset was not recorded (the Trainer logged it as `None`).
	After the final epoch (30), it achieves the following results on the evaluation set:
	- Loss: 0.7088
	- Bleu: 71.7187
	- Gen Len: 14.3652

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
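
Until this section is filled in, the following is a minimal inference sketch, not a documented usage recipe. It assumes the checkpoint loads with the standard `transformers` seq2seq classes, which is consistent with the `google-t5/t5-small` base model. The card does not say whether a T5 task prefix was used during fine-tuning, so the `prefix` argument is a hypothetical knob that defaults to an empty string. `build_input` and `translate` are illustrative helper names, not part of any library.

```python
MODEL_ID = "Kumshe/t5-small-finetuned-english-to-hausa"


def build_input(text, prefix=""):
    """Prepend an optional task prefix to the source sentence.

    Assumption: the card does not state whether a prefix was used in
    training, so none is applied by default.
    """
    return f"{prefix}{text.strip()}"


def translate(text, prefix="", max_new_tokens=64):
    """Translate one English sentence with the fine-tuned checkpoint."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_input(text, prefix), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model from the Hub on first use):
# print(translate("Good morning, how are you?"))
```

If outputs look poor without a prefix, trying one is a reasonable experiment, but which prefix (if any) matches the training data is not something this card confirms.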

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0008
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 3000
	- num_epochs: 30
	- mixed_precision_training: Native AMP
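
To make the learning-rate settings above concrete, here is a small pure-Python sketch of a linear-warmup, cosine-decay schedule. It assumes the schedule follows the usual shape of a warmup followed by a single half-cosine decay to zero, and that the total number of optimizer steps is 22470 (the last step in the training results table). `lr_at` is an illustrative helper, not the Trainer's own code.

```python
import math

PEAK_LR = 8e-4        # learning_rate from the card
WARMUP_STEPS = 3000   # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 22470   # assumed: final step in the results table (30 epochs x 749 steps)


def lr_at(step, peak=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to `peak`, then half-cosine decay to zero at `total`."""
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 1500, 3000, 12735, 22470):
    print(f"step {step:>5}: lr = {lr_at(step):.6f}")
```

Under these assumptions the warmup ends at step 3000, just after epoch 4 (step 2996), so the first four epochs run below the peak learning rate.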

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
	|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
	| 3.1612        | 1.0   | 749   | 1.7523          | 32.7424 | 15.2302 |
	| 1.5573        | 2.0   | 1498  | 1.0553          | 53.4401 | 14.5568 |
	| 1.0462        | 3.0   | 2247  | 0.7899          | 60.8893 | 14.71   |
	| 0.8071        | 4.0   | 2996  | 0.6780          | 64.3438 | 14.4066 |
	| 0.6602        | 5.0   | 3745  | 0.6089          | 66.0887 | 14.127  |
	| 0.5562        | 6.0   | 4494  | 0.5741          | 66.8902 | 14.1295 |
	| 0.4872        | 7.0   | 5243  | 0.5497          | 68.4261 | 14.3395 |
	| 0.4299        | 8.0   | 5992  | 0.5412          | 68.9385 | 14.3446 |
	| 0.3872        | 9.0   | 6741  | 0.5377          | 69.5675 | 14.2603 |
	| 0.3478        | 10.0  | 7490  | 0.5356          | 70.0045 | 14.3615 |
	| 0.3147        | 11.0  | 8239  | 0.5312          | 70.1895 | 14.4524 |
	| 0.2848        | 12.0  | 8988  | 0.5484          | 70.8151 | 14.366  |
	| 0.2584        | 13.0  | 9737  | 0.5523          | 70.6127 | 14.2939 |
	| 0.2342        | 14.0  | 10486 | 0.5642          | 70.7368 | 14.3301 |
	| 0.2122        | 15.0  | 11235 | 0.5775          | 70.9399 | 14.3635 |
	| 0.1928        | 16.0  | 11984 | 0.5935          | 71.2577 | 14.352  |
	| 0.1757        | 17.0  | 12733 | 0.5964          | 71.2056 | 14.3929 |
	| 0.1608        | 18.0  | 13482 | 0.6085          | 71.0265 | 14.3877 |
	| 0.1475        | 19.0  | 14231 | 0.6219          | 71.5491 | 14.3812 |
	| 0.1352        | 20.0  | 14980 | 0.6285          | 71.5971 | 14.3675 |
	| 0.1237        | 21.0  | 15729 | 0.6468          | 71.4863 | 14.3782 |
	| 0.1142        | 22.0  | 16478 | 0.6652          | 71.5849 | 14.3734 |
	| 0.1082        | 23.0  | 17227 | 0.6733          | 71.6037 | 14.3298 |
	| 0.0998        | 24.0  | 17976 | 0.6852          | 71.6926 | 14.4066 |
	| 0.0962        | 25.0  | 18725 | 0.6899          | 71.7003 | 14.358  |
	| 0.0915        | 26.0  | 19474 | 0.6994          | 71.6191 | 14.3702 |
	| 0.0882        | 27.0  | 20223 | 0.7033          | 71.5731 | 14.3537 |
	| 0.0857        | 28.0  | 20972 | 0.7084          | 71.6407 | 14.3618 |
	| 0.0853        | 29.0  | 21721 | 0.7086          | 71.7115 | 14.3635 |
	| 0.0847        | 30.0  | 22470 | 0.7088          | 71.7187 | 14.3652 |
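
The table shows validation loss and BLEU moving in different directions late in training. The short sketch below, with values copied from the table, finds the epoch with the lowest validation loss and the epoch with the highest BLEU. It is only a reading aid for the logged numbers; the variable names are illustrative.

```python
# (epoch, validation_loss, bleu) copied from the training results table
rows = [
    (1, 1.7523, 32.7424), (2, 1.0553, 53.4401), (3, 0.7899, 60.8893),
    (4, 0.6780, 64.3438), (5, 0.6089, 66.0887), (6, 0.5741, 66.8902),
    (7, 0.5497, 68.4261), (8, 0.5412, 68.9385), (9, 0.5377, 69.5675),
    (10, 0.5356, 70.0045), (11, 0.5312, 70.1895), (12, 0.5484, 70.8151),
    (13, 0.5523, 70.6127), (14, 0.5642, 70.7368), (15, 0.5775, 70.9399),
    (16, 0.5935, 71.2577), (17, 0.5964, 71.2056), (18, 0.6085, 71.0265),
    (19, 0.6219, 71.5491), (20, 0.6285, 71.5971), (21, 0.6468, 71.4863),
    (22, 0.6652, 71.5849), (23, 0.6733, 71.6037), (24, 0.6852, 71.6926),
    (25, 0.6899, 71.7003), (26, 0.6994, 71.6191), (27, 0.7033, 71.5731),
    (28, 0.7084, 71.6407), (29, 0.7086, 71.7115), (30, 0.7088, 71.7187),
]

best_loss = min(rows, key=lambda r: r[1])  # row with the lowest validation loss
best_bleu = max(rows, key=lambda r: r[2])  # row with the highest BLEU

print(f"lowest validation loss: {best_loss[1]} at epoch {best_loss[0]} (BLEU {best_loss[2]})")
print(f"highest BLEU: {best_bleu[2]} at epoch {best_bleu[0]} (validation loss {best_bleu[1]})")
```

Validation loss bottoms out at epoch 11 (0.5312) and rises afterwards while training loss keeps falling, whereas BLEU keeps creeping up to its maximum at epoch 30 (71.7187). The reported final checkpoint therefore has the best BLEU in the table but not the lowest validation loss.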


	### Framework versions

	- Transformers 4.44.2
	- Pytorch 2.4.0+cu121
	- Datasets 2.21.0
	- Tokenizers 0.19.1