Thytu committed on
Commit
2e2e3c6
1 Parent(s): 6ee1e34

Update README.md

  1. README.md +52 -0
README.md CHANGED
@@ -17,4 +17,56 @@ widget:
  pipeline_tag: text-generation
  ---

+ # Phi-2-audio-super
+
+ Base Model: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
+
  Fine-tuned version of [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super) for ASR on [librispeech_asr](https://huggingface.co/datasets/librispeech_asr).
+
+ ## How to run inference for text only:
+
+ ```python
+ import transformers
+ import torch
+
+ if __name__ == "__main__":
+     model_name = "abacaj/phi-2-audio-super"
+     tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
+
+     model = (
+         transformers.AutoModelForCausalLM.from_pretrained(
+             model_name,
+         )
+         .to("cuda:0")
+         .eval()
+     )
+
+     # Exactly like for phi-2-super :D
+     messages = [
+         {"role": "user", "content": "Hello, who are you?"}
+     ]
+     inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
+     input_ids_cutoff = inputs.size(dim=1)
+
+     with torch.no_grad():
+         generated_ids = model.generate(
+             input_ids=inputs,
+             use_cache=True,
+             max_new_tokens=512,
+             temperature=0.2,
+             top_p=0.95,
+             do_sample=True,
+             eos_token_id=tokenizer.eos_token_id,
+             pad_token_id=tokenizer.pad_token_id,
+         )
+
+     completion = tokenizer.decode(
+         generated_ids[0][input_ids_cutoff:],
+         skip_special_tokens=True,
+     )
+
+     print(completion)
+ ```
+
+ ## How to run inference for ASR:
+ TODO