metadata
language:
  - ru
  - kbd
license: mit
tags:
  - translation
datasets:
  - anzorq/kbd-ru
widget:
  - text: Я иду домой.
    example_title: Я иду домой.
  - text: Дети играют во дворе.
    example_title: Дети играют во дворе.
  - text: Сколько тебе лет?
    example_title: Сколько тебе лет?
  - text: На следующий день мы отправились в путь.
    example_title: На следующий день мы отправились в путь.
base_model: facebook/m2m100_418M

m2m100_ru_kbd_44K

This model is a fine-tuned version of facebook/m2m100_418M on a Russian-Kabardian (ru-kbd) dataset containing 44K sentences from books, textbooks, dictionaries, and other sources. It achieves the following results on the evaluation set:

  • Loss: 0.9399
  • Bleu: 22.389
  • Gen Len: 16.562

Model description

More information needed

Intended uses & limitations

More information needed
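
Until this section is filled in, the following is a minimal inference sketch, assuming the checkpoint loads with the standard M2M100 classes from Transformers, as the base model facebook/m2m100_418M does. The function name `translate` and its parameters are hypothetical. The card states neither the full Hub repository id nor which M2M100 language token stands in for Kabardian, so both `model_id` and `tgt_lang` are left as required arguments rather than guessed.

```python
def translate(texts, model_id, tgt_lang, src_lang="ru", max_new_tokens=64):
    """Translate a list of sentences with a fine-tuned M2M100 checkpoint.

    model_id and tgt_lang are NOT given in the model card; the caller must
    supply the repository id and the target-language code used in fine-tuning.
    """
    # Imported lazily so this module loads even where transformers
    # (plus torch and sentencepiece) is not installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    # M2M100 prefixes the input with a source-language token.
    tokenizer.src_lang = src_lang
    batch = tokenizer(texts, return_tensors="pt", padding=True)

    # The first generated token is forced to the target-language token.
    generated = model.generate(
        **batch,
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

A call would look like `translate(["Я иду домой."], model_id=..., tgt_lang=...)`, with both values taken from the model's repository rather than from this card.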

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
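
To make the `linear` scheduler setting concrete, here is a small sketch of the learning-rate curve it implies. It assumes zero warmup steps, since none are listed, and a total of 16,800 optimizer steps, which is an illustrative estimate and not a value stated in this card. The function name `linear_lr` is hypothetical.

```python
BASE_LR = 5e-05        # learning_rate from the list above
TOTAL_STEPS = 16_800   # ASSUMPTION: not stated in the card, illustrative only


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup (unused when warmup is 0).
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


for step in (0, 4_200, 8_400, 12_600, 16_800):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, reaches half that value at the midpoint of training, and hits zero at the final step.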

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 2.2391 | 0.18 | 1000 | 1.9921 | 7.4066 | 16.377 |
| 1.8436 | 0.36 | 2000 | 1.6756 | 9.3443 | 18.428 |
| 1.63 | 0.53 | 3000 | 1.5361 | 10.9057 | 17.134 |
| 1.5205 | 0.71 | 4000 | 1.3994 | 12.6061 | 17.471 |
| 1.4471 | 0.89 | 5000 | 1.3107 | 14.4452 | 16.985 |
| 1.1915 | 1.07 | 6000 | 1.2462 | 15.1903 | 16.544 |
| 1.1165 | 1.25 | 7000 | 1.1917 | 16.3859 | 17.044 |
| 1.0654 | 1.43 | 8000 | 1.1351 | 17.617 | 16.481 |
| 1.0464 | 1.6 | 9000 | 1.0939 | 18.649 | 16.517 |
| 1.0376 | 1.78 | 10000 | 1.0603 | 18.2567 | 17.152 |
| 1.0027 | 1.96 | 11000 | 1.0184 | 20.6011 | 16.875 |
| 0.7741 | 2.14 | 12000 | 1.0159 | 20.4801 | 16.488 |
| 0.7566 | 2.32 | 13000 | 0.9899 | 21.6967 | 16.681 |
| 0.7346 | 2.49 | 14000 | 0.9738 | 21.8249 | 16.679 |
| 0.7397 | 2.67 | 15000 | 0.9555 | 21.569 | 16.608 |
| 0.6919 | 2.85 | 16000 | 0.9441 | 22.4658 | 16.493 |
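
As a sanity check on the table above, the sketch below picks the checkpoint with the highest BLEU, lists the steps where BLEU dropped relative to the previous evaluation, and estimates the training-set size from the step and epoch columns. The size estimate assumes a single device and no gradient accumulation, neither of which is stated in this card, so it is only a rough consistency check against the 44K figure.

```python
# (epoch, step, validation_loss, bleu) copied from the table above
ROWS = [
    (0.18, 1000, 1.9921, 7.4066),
    (0.36, 2000, 1.6756, 9.3443),
    (0.53, 3000, 1.5361, 10.9057),
    (0.71, 4000, 1.3994, 12.6061),
    (0.89, 5000, 1.3107, 14.4452),
    (1.07, 6000, 1.2462, 15.1903),
    (1.25, 7000, 1.1917, 16.3859),
    (1.43, 8000, 1.1351, 17.617),
    (1.6, 9000, 1.0939, 18.649),
    (1.78, 10000, 1.0603, 18.2567),
    (1.96, 11000, 1.0184, 20.6011),
    (2.14, 12000, 1.0159, 20.4801),
    (2.32, 13000, 0.9899, 21.6967),
    (2.49, 14000, 0.9738, 21.8249),
    (2.67, 15000, 0.9555, 21.569),
    (2.85, 16000, 0.9441, 22.4658),
]
TRAIN_BATCH_SIZE = 8  # train_batch_size from the hyperparameter list

# Checkpoint with the highest BLEU score.
best_step = max(ROWS, key=lambda r: r[3])[1]

# Steps where BLEU fell compared with the previous evaluation.
bleu_drops = [cur[1] for prev, cur in zip(ROWS, ROWS[1:]) if cur[3] < prev[3]]

# The last row has the smallest relative rounding error in the epoch column.
last_epoch, last_step = ROWS[-1][0], ROWS[-1][1]
steps_per_epoch = last_step / last_epoch

# ASSUMPTION: one device, no gradient accumulation, so one step = one batch.
est_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(best_step, bleu_drops, round(steps_per_epoch), round(est_examples))
```

The estimate comes out at roughly 5,600 steps per epoch, or about 45K training examples, which is in line with the 44K sentences mentioned in the introduction.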

Framework versions

  • Transformers 4.21.0
  • PyTorch 1.10.0+cu113
  • Datasets 2.4.0
  • Tokenizers 0.12.1