guocheng98
/

HelsinkiNLP-FineTuned-Legal-es-zh

text2text-generation

Inference Endpoints

Model card Files Files and versions Community

HelsinkiNLP-FineTuned-Legal-es-zh / README.md

guocheng98's picture

Update README.md

59b46b0 over 3 years ago

|

history blame contribute delete

3.05 kB

	---
	language:
	- es
	- zh

	tags:
	- translation

	license: apache-2.0
	---

	This model is a fine-tuned version of [Helsinki-NLP/opus-tatoeba-es-zh](https://huggingface.co/Helsinki-NLP/opus-tatoeba-es-zh) on a dataset of legal domain constructed by the author himself.


	# Intended uses & limitations

	This model is the result of the master graduation thesis for the Tradumatics: Translation Technologies program at the Autonomous University of Barcelona.

	Please refer to the GitHub repo created for this thesis for the full-text and relative open-sourced materials: https://github.com/guocheng98/MUTTT2020_TFM_ZGC

	The thesis intends to explain various theories and certain algorithm details about neural machine translation, thus this fine-tuned model only serves as a hands-on practice example for that objective, without any intention of productive usage.


	# Training and evaluation data

	The dataset is constructed from the Chinese translation of Spanish Civil Code, Spanish Constitution, and many other laws & regulations found in the database China Law Info (北大法宝 Beida Fabao), along with their source text found on Boletín Oficial del Estado and EUR-Lex.

	There are 9972 sentence pairs constructed. 1000 are used for evaluation and the rest for training.


	# Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 2000
	- num_epochs: 10
	- mixed_precision_training: Native AMP
	- weight_decay: 0.01
	- early_stopping_patience: 8


	# Training results

	Best validation loss achieved at step 5600.

	\| Training Loss \| Epoch \| Step \| Validation Loss \|
	\|:-------------:\|:-----:\|:----:\|:---------------:\|
	\| 2.9584 \| 0.36 \| 400 \| 2.6800 \|
	\| 2.6402 \| 0.71 \| 800 \| 2.5017 \|
	\| 2.5038 \| 1.07 \| 1200 \| 2.3907 \|
	\| 2.3279 \| 1.43 \| 1600 \| 2.2999 \|
	\| 2.2258 \| 1.78 \| 2000 \| 2.2343 \|
	\| 2.1061 \| 2.14 \| 2400 \| 2.1961 \|
	\| 1.9279 \| 2.5 \| 2800 \| 2.1569 \|
	\| 1.9059 \| 2.85 \| 3200 \| 2.1245 \|
	\| 1.7491 \| 3.21 \| 3600 \| 2.1227 \|
	\| 1.6301 \| 3.57 \| 4000 \| 2.1169 \|
	\| 1.6871 \| 3.92 \| 4400 \| 2.0979 \|
	\| 1.5203 \| 4.28 \| 4800 \| 2.1074 \|
	\| 1.4646 \| 4.63 \| 5200 \| 2.1024 \|
	\| 1.4739 \| 4.99 \| 5600 \| 2.0905 \|
	\| 1.338 \| 5.35 \| 6000 \| 2.0946 \|
	\| 1.3152 \| 5.7 \| 6400 \| 2.0974 \|
	\| 1.306 \| 6.06 \| 6800 \| 2.0985 \|
	\| 1.1991 \| 6.42 \| 7200 \| 2.0962 \|
	\| 1.2113 \| 6.77 \| 7600 \| 2.1092 \|
	\| 1.1983 \| 7.13 \| 8000 \| 2.1060 \|
	\| 1.1238 \| 7.49 \| 8400 \| 2.1102 \|
	\| 1.1417 \| 7.84 \| 8800 \| 2.1078 \|


	# Framework versions

	- Transformers 4.7.0
	- Pytorch 1.8.1+cu101
	- Datasets 1.8.0
	- Tokenizers 0.10.3