asahi417 committed on
Commit
9c14f5a
1 Parent(s): 9b63762

Update README.md

Files changed (1)
  1. README.md +15 -16
README.md CHANGED
@@ -14,24 +14,23 @@ The input must be Japanese speech, while the translation can be in any languages
  [here](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200).

  ## Benchmark
- The following table shows the CER computed over the reference and predicted translations for the Japanese speech-to-English text translation task
- (subsets of [CoVoST2 and Fleurs](https://huggingface.co/datasets/japanese-asr/en2ja.s2t_translation)). We benchmark different sizes of NLLB and confirm that
- the distilled model (`facebook/nllb-200-distilled-600M`) already achieves competitive results. Also, none of the public OpenAI Whisper models are capable of translating
- Japanese speech to English.
+ The following table shows the WER computed over the reference and predicted translations for the Japanese speech-to-English text translation task
+ (subsets of [CoVoST2 and Fleurs](https://huggingface.co/datasets/japanese-asr/ja2en.s2t_translation)), comparing different sizes of NLLB along with the OpenAI Whisper models.

- | model | [Translation (En->Ja)](https://huggingface.co/datasets/japanese-asr/en2ja.s2t_translation) (CoVoST2) | [Translation (En->Ja)](https://huggingface.co/datasets/japanese-asr/en2ja.s2t_translation) (Fleurs) |
+ | model | [CoVoST2 (Ja->En)](https://huggingface.co/datasets/japanese-asr/ja2en.s2t_translation) | [Fleurs (Ja->En)](https://huggingface.co/datasets/japanese-asr/ja2en.s2t_translation) |
  |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------:|------------------------------------------------------------------------------------------------------:|
- | [japanese-asr/en-cascaded-s2t-translation](https://huggingface.co/japanese-asr/en-cascaded-s2t-translation) ([facebook/nllb-200-3.3B](https://huggingface.co/facebook/nllb-200-3.3B)) | 62.4 | 63.5 |
- | [japanese-asr/en-cascaded-s2t-translation](https://huggingface.co/japanese-asr/en-cascaded-s2t-translation) ([facebook/nllb-200-1.3B](https://huggingface.co/facebook/nllb-200-1.3B)) | 64.4 | 67.2 |
- | [japanese-asr/en-cascaded-s2t-translation](https://huggingface.co/japanese-asr/en-cascaded-s2t-translation) ([facebook/nllb-200-distilled-1.3B](https://huggingface.co/facebook/nllb-200-distilled-1.3B)) | 62.4 | 62.9 |
- | [japanese-asr/en-cascaded-s2t-translation](https://huggingface.co/japanese-asr/en-cascaded-s2t-translation) ([facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M)) | 63.4 | 66.2 |
- | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 178.9 | 209.5 |
- | [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 179.6 | 201.8 |
- | [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 178.7 | 201.8 |
- | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 178.7 | 202 |
- | [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 179.5 | 214.2 |
- | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 178.9 | 206.8 |
- | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 185.2 | 200.5 |
+ | [japanese-asr/ja-cascaded-s2t-translation](https://huggingface.co/japanese-asr/ja-cascaded-s2t-translation) ([facebook/nllb-200-3.3B](https://huggingface.co/facebook/nllb-200-3.3B)) | 64.3 | 67.1 |
+ | [japanese-asr/ja-cascaded-s2t-translation](https://huggingface.co/japanese-asr/ja-cascaded-s2t-translation) ([facebook/nllb-200-1.3B](https://huggingface.co/facebook/nllb-200-1.3B)) | 65.4 | 68.9 |
+ | [japanese-asr/ja-cascaded-s2t-translation](https://huggingface.co/japanese-asr/ja-cascaded-s2t-translation) ([facebook/nllb-200-distilled-1.3B](https://huggingface.co/facebook/nllb-200-distilled-1.3B)) | 65.6 | 67.4 |
+ | [japanese-asr/ja-cascaded-s2t-translation](https://huggingface.co/japanese-asr/ja-cascaded-s2t-translation) ([facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M)) | 68.2 | 72.2 |
+ | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 71 | 86.1 |
+ | [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) | 66.4 | 78.8 |
+ | [openai/whisper-large](https://huggingface.co/openai/whisper-large) | 66.5 | 86.1 |
+ | [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) | 70.3 | 97.2 |
+ | [openai/whisper-small](https://huggingface.co/openai/whisper-small) | 97.3 | 132.2 |
+ | [openai/whisper-base](https://huggingface.co/openai/whisper-base) | 186.2 | 349.6 |
+ | [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) | 377.2 | 474 |
+

  See [https://github.com/kotoba-tech/kotoba-whisper](https://github.com/kotoba-tech/kotoba-whisper) for the evaluation details.
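
The cascaded entries in the table above pair a Japanese ASR model with an NLLB translation model. The snippet below is a minimal sketch of that two-step inference using the `transformers` pipeline API; the ASR checkpoint, the input file `sample_ja.wav`, and the decoding options are illustrative assumptions rather than the repository's actual inference code.

```python
from transformers import pipeline

# Step 1: transcribe the Japanese audio (the ASR checkpoint here is an assumption).
asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")
transcript = asr(
    "sample_ja.wav",  # hypothetical input file containing Japanese speech
    generate_kwargs={"language": "ja", "task": "transcribe"},
)["text"]

# Step 2: translate the Japanese transcript into English with NLLB
# (FLORES-200 language codes: jpn_Jpan -> eng_Latn).
translator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="jpn_Jpan",
    tgt_lang="eng_Latn",
)
print(translator(transcript)[0]["translation_text"])
```

Swapping `facebook/nllb-200-distilled-600M` for the larger NLLB checkpoints listed in the table changes only the translation step of this cascade.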
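The scores themselves are WER values computed between reference and predicted English translations. Below is a minimal sketch of such a computation, assuming the `evaluate` library's `wer` metric and hypothetical reference/prediction strings; the exact scoring and text normalisation live in the kotoba-whisper repository linked above.

```python
import evaluate

wer_metric = evaluate.load("wer")

# Hypothetical reference translations and system outputs.
references = ["he is a teacher at a local high school"]
predictions = ["he is teacher at the local high school"]

# WER = (substitutions + deletions + insertions) / number of reference words,
# reported here as a percentage to match the table.
score = 100 * wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {score:.1f}")
```

Because insertions count as errors, WER can exceed 100 when a prediction is much longer than its reference, which is why some of the Whisper baselines in the table score above 100.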