jq committed
Commit e25860a
1 Parent(s): 75df3cb

Update README.md

Files changed (1)
  1. README.md +51 -49
README.md CHANGED
@@ -8,68 +8,70 @@ metrics:
  model-index:
  - name: mms-lug
    results: []
+ datasets:
+ - Sunbird/salt
+ language:
+ - lg
+ - en
+ - ach
+ - teo
+ - lgg
+ - nyn
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
 
- # Sunbird - MMS Finetuned Models
 
- This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on the None dataset.
 
- ## Model description
 
- More information needed
 
- ## Intended uses & limitations
 
- More information needed
 
- ## Training and evaluation data
 
- More information needed
 
- ## Training procedure
 
- ### Training hyperparameters
 
- To Add
 
- ### Results
 
- | Language Adapter | WER (%) | CER (%) | Additional Details |
- |---------------------|--------:|--------:|---------------------|
- | **Luganda (Lug)** | | | |
- | Lug-Base | 0.25 | | |
- | Lug+5Gram LM | | | |
- | Lug+3Gram LM | | | |
- | Lug+English Combined| 0.12 | | |
- | **Acholi (Ach)** | | | |
- | Ach-Base | 0.34 | | |
- | Ach+3Gram LM | | | |
- | Ach+5Gram LM | | | |
- | Ach+English Combined| 0.18 | | |
- | **Lugbara (Lgg)** | | | |
- | Lgg-Base | | | |
- | Lgg+3Gram LM | | | |
- | Lgg+5Gram LM | | | |
- | Lgg+English Combined| 0.25 | | |
- | **Teso (Teo)** | | | |
- | Teo-Base | 0.39 | | |
- | Teo+3Gram LM | | | |
- | Teo+5Gram LM | | | |
- | Teo+English Combined| 0.29 | | |
- | **Nyankore (Nyn)** | | | |
- | Nyn-Base | 0.48 | | |
- | Nyn+3Gram LM | | | |
- | Nyn+5Gram LM | | | |
- | Nyn+English Combined| 0.29 | | |
 
- _Note: LM stands for Language Model. The `+3Gram LM` and `+5Gram LM` suffixes indicate models enhanced with trigram and five-gram language models, respectively._
 
- ### Framework versions
 
- - Transformers 4.32.0.dev0
- - Pytorch 2.0.1+cu117
- - Datasets 2.13.0
- - Tokenizers 0.13.3
 
+ # MMS speech recognition for Ugandan languages
 
+ This is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all)
+ for Ugandan languages, trained with the [SALT](https://huggingface.co/datasets/Sunbird/salt) dataset. The languages supported are:
 
+ | code | language |
+ | --- | --- |
+ | lug | Luganda |
+ | ach | Acholi |
+ | lgg | Lugbara |
+ | teo | Ateso |
+ | nyn | Runyankole |
 
+ For each language there are two adapters: one optimised for cases where the speech is only in that language,
+ and another for cases where code-switching with English is expected (a sketch of switching adapters follows the usage example below).
 
+ # Usage
 
+ Usage is the same as for the base model, though with different adapters available.
 
+ ```python
+ import torch
+ import transformers
+ import datasets
 
+ # Available adapters:
+ # ['lug', 'lug+eng', 'ach', 'ach+eng', 'lgg', 'lgg+eng',
+ #  'nyn', 'nyn+eng', 'teo', 'teo+eng']
+ language = 'lug'
 
+ device = 'cuda' if torch.cuda.is_available() else 'cpu'
+ model = transformers.Wav2Vec2ForCTC.from_pretrained(
+     'Sunbird/asr-mms-salt').to(device)
+ model.load_adapter(language)
 
+ processor = transformers.Wav2Vec2Processor.from_pretrained(
+     'Sunbird/asr-mms-salt')
+ processor.tokenizer.set_target_lang(language)
 
+ # Get some test audio
+ ds = datasets.load_dataset('Sunbird/salt', 'multispeaker-lug', split='test')
+ audio = ds[0]['audio']
+ sample_rate = ds[0]['sample_rate']
 
+ # Apply the model
+ inputs = processor(audio, sampling_rate=sample_rate, return_tensors="pt")
 
+ with torch.no_grad():
+     outputs = model(**inputs.to(device)).logits
 
+ ids = torch.argmax(outputs, dim=-1)[0]
+ transcription = processor.decode(ids)
 
+ print(transcription)
+ # ekikola ky'akasooli kyakyenvu wabula langi yakyo etera okuba eyaakitaka wansi
+ ```
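 
+ For speech that mixes one of these languages with English, load the corresponding
+ code-switching adapter instead. A minimal sketch, reusing the `model` and `processor`
+ objects from the example above; the adapter names are those listed in the comment there:
 
+ ```python
+ # Switch to the Luganda+English code-switching adapter.
+ language = 'lug+eng'
+ model.load_adapter(language)
+ processor.tokenizer.set_target_lang(language)
+ # Apply the model to new audio exactly as in the example above.
+ ```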