---
license: mit
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: m2m100_418M-finetuned-ko-to-en4
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# m2m100_418M-finetuned-ko-to-en4

This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4598
- Bleu: 85.3745
- Gen Len: 9.7522

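A minimal inference sketch is given below. It assumes the checkpoint is available on the Hub (the repo id shown is a placeholder; substitute the actual path of this model) and uses the standard M2M100 tokenizer API: set `src_lang` to Korean and force English as the first generated token.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Placeholder repo id; replace with the actual Hub path of this checkpoint.
model_id = "m2m100_418M-finetuned-ko-to-en4"

tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "ko"  # source language: Korean
inputs = tokenizer("안녕하세요, 만나서 반갑습니다.", return_tensors="pt")

# Force the decoder to start with the English language token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("en"),
    max_length=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```
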
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 256
- total_train_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP

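As a rough sketch only (not the actual training script used for this model), the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows; `output_dir`, the evaluation strategy, and `predict_with_generate` are assumptions, while the rest is taken from the list.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch: how the reported hyperparameters could be expressed as training arguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-finetuned-ko-to-en4",  # assumption
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=256,  # 4 x 256 = total train batch size of 1024
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="epoch",  # assumption: matches the per-epoch rows below
    predict_with_generate=True,   # assumption: needed for BLEU / Gen Len during eval
)
```
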
### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 105  | 1.8667          | 24.5072 | 9.523   |
| No log        | 2.0   | 210  | 0.8581          | 57.9973 | 9.2779  |
| No log        | 3.0   | 315  | 0.6587          | 69.4588 | 9.7399  |
| No log        | 4.0   | 420  | 0.5762          | 74.5636 | 9.6775  |
| 1.4539        | 5.0   | 525  | 0.5254          | 78.8897 | 9.6946  |
| 1.4539        | 6.0   | 630  | 0.4952          | 81.0054 | 9.7073  |
| 1.4539        | 7.0   | 735  | 0.4773          | 83.0792 | 9.7233  |
| 1.4539        | 8.0   | 840  | 0.4669          | 84.4309 | 9.7429  |
| 1.4539        | 9.0   | 945  | 0.4616          | 85.0965 | 9.749   |
| 0.144         | 10.0  | 1050 | 0.4598          | 85.3745 | 9.7522  |

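The `Bleu` and `Gen Len` columns above look like the metrics produced by the standard Hugging Face translation fine-tuning recipe (SacreBLEU score plus mean generated length). A minimal sketch of such a `compute_metrics` function is shown below, assuming the base tokenizer is used for decoding; this is an assumption about how the numbers were produced, not the authors' code.

```python
import numpy as np
from datasets import load_metric
from transformers import M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
metric = load_metric("sacrebleu")

def compute_metrics(eval_preds):
    # eval_preds holds generated token ids and label ids from Seq2SeqTrainer.
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Replace -100 (ignored positions) with the pad token before decoding references.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(
        predictions=[p.strip() for p in decoded_preds],
        references=[[l.strip()] for l in decoded_labels],
    )
    gen_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    return {"bleu": result["score"], "gen_len": float(np.mean(gen_lens))}
```
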
### Framework versions

- Transformers 4.18.0
- Pytorch 1.11.0+cu113
- Datasets 2.1.0
- Tokenizers 0.12.1