shangeth committed on
Commit
1b27420
1 Parent(s): d9290bd

Update README.md

  1. README.md +36 -1
README.md CHANGED
@@ -53,4 +53,39 @@ model-index:
   - name: Test WER
     type: wer
     value: 25.01
- ---
+ ---
+
+ # SpeechLLM
+
+ ## Usage
+ ```python
+ # Load the model directly from the Hugging Face Hub
+ from transformers import AutoModel
+ model = AutoModel.from_pretrained("skit-ai/SpeechLLM", trust_remote_code=True)
+
+ model.generate_meta(
+     audio_path="path-to-audio.wav",
+     instruction="Give me the following information about the audio [SpeechActivity, Transcript, Gender, Emotion, Age, Accent]",
+     max_new_tokens=500,
+     return_special_tokens=False
+ )
+
+ # Example model output
+ '''
+ { "SpeechActivity" : "True",
+ "Transcript": "Yes, I got it. I'll make the payment now.",
+ "Gender": "Female",
+ "Emotion": "Neutral",
+ "Age": "Young",
+ "Accent" : "America",
+ }
+ '''
+ ```
+
+ ## Checkpoint Result
+
+ | Dataset | Word Error Rate | Gender Accuracy |
+ |:----------------------:|:------------------:|:---------:|
+ | librispeech-test-clean | 0.1230 | 0.8778 |
+ | librispeech-test-other | 0.1890 | 0.8908 |
+ | CommonVoice test | 0.2501 | 0.8753 |
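
For readers unfamiliar with the metric reported in the table above, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the sample strings are illustrative assumptions. This is not the evaluation script used for this checkpoint, and any text normalization applied before scoring is not stated in the README.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: assumes whitespace tokenization and no text
    normalization (casing, punctuation), which real evaluations often apply.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, lowercased by hand for the comparison.
    ref = "yes i got it i'll make the payment now"
    hyp = "yes i got it i make the payment"
    print(f"WER: {word_error_rate(ref, hyp):.4f}")
```

Under this definition a value such as 0.2501 is a fraction, which corresponds to the 25.01 reported as `Test WER` in the model-index metadata.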