|
--- |
|
model-index: |
|
- name: medieval-it5-base |
|
results: [] |
|
language: |
|
- it |
|
--- |
|
|
|
# medieval-it5-base |
|
|
|
This model is a version of [gsarti/it5-base](https://huggingface.co/gsarti/it5-base) fine-tuned on the [ita2medieval](https://huggingface.co/datasets/leobertolazzi/ita2medieval) dataset. The dataset contains sentences in medieval Italian along with their paraphrases in contemporary Italian (approximately 6.5k pairs in total).
|
|
|
The fine-tuning task is text style transfer from contemporary to medieval Italian.
|
|
|
|
|
## Using the model |
|
|
|
``` |
|
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
|
tokenizer = AutoTokenizer.from_pretrained("leobertolazzi/medieval-it5-base") |
|
model = AutoModelForSeq2SeqLM.from_pretrained("leobertolazzi/medieval-it5-base") |
|
``` |
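Continuing from the snippet above, the model can be used to turn a contemporary Italian sentence into medieval style. The example sentence and the generation settings (`max_length`, `num_beams`) below are illustrative choices, not values prescribed by the model, and no task prefix is assumed:

```
# Paraphrase a contemporary Italian sentence in medieval style.
input_text = "Mi trovai in un bosco buio senza sapere dove andare."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```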
|
|
|
Flax and TensorFlow versions of the model are also available:
|
``` |
|
from transformers import FlaxT5ForConditionalGeneration, TFT5ForConditionalGeneration |
|
model_flax = FlaxT5ForConditionalGeneration.from_pretrained("leobertolazzi/medieval-it5-base") |
|
model_tf = TFT5ForConditionalGeneration.from_pretrained("leobertolazzi/medieval-it5-base") |
|
``` |
|
|
|
## Training procedure |
|
|
|
The code used for the fine-tuning is available in this [repo](https://github.com/leobertolazzi/medievalIT5).
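For orientation, a minimal fine-tuning sketch with the `transformers` `Seq2SeqTrainer` could look like the following. The dataset column names (`"italian"`, `"medieval"`), hyperparameters, and preprocessing here are assumptions for illustration only; see the repo above for the actual training code:

```
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Column names are assumed; check the dataset card for the real ones.
dataset = load_dataset("leobertolazzi/ita2medieval")
tokenizer = AutoTokenizer.from_pretrained("gsarti/it5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("gsarti/it5-base")

def preprocess(batch):
    # Contemporary Italian as input, medieval Italian as target.
    model_inputs = tokenizer(batch["italian"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["medieval"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)

args = Seq2SeqTrainingArguments(
    output_dir="medieval-it5-base",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```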
|
|
|
## Intended uses & limitations |
|
|
|
The biggest limitation of this project is the size of the ita2medieval dataset: it consists of only 6.5k sentence pairs, whereas [gsarti/it5-base](https://huggingface.co/gsarti/it5-base) has 220M parameters.
|
|
|
For this reason the results are often far from perfect, although some nice style translations can still be obtained.
|
|
|
It would be nice to expand ita2medieval with texts and paraphrases from more medieval Italian authors!
|
|
|
### Framework versions |
|
|
|
- Transformers 4.26.0 |
|
- Tokenizers 0.13.2 |
|
|