leaderboard-pr-bot committed on
Commit
e7c6ec9
1 Parent(s): bc39ba6

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +126 -6
README.md CHANGED
@@ -1,15 +1,135 @@
 ---
 license: mit
 widget:
-- text: |
-    <|system|>
-    You are a helpful assistant</s>
-    <|user|>
-    Explain to me how black holes are formed</s>
-    <|assistant|>
+- text: '<|system|>
+
+    You are a helpful assistant</s>
+
+    <|user|>
+
+    Explain to me how black holes are formed</s>
+
+    <|assistant|>'
+model-index:
+- name: TinyLlama-3T-Cinder-v1.3
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: AI2 Reasoning Challenge (25-Shot)
+      type: ai2_arc
+      config: ARC-Challenge
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: acc_norm
+      value: 33.96
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Josephgflowers/TinyLlama-3T-Cinder-v1.3
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HellaSwag (10-Shot)
+      type: hellaswag
+      split: validation
+      args:
+        num_few_shot: 10
+    metrics:
+    - type: acc_norm
+      value: 58.14
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Josephgflowers/TinyLlama-3T-Cinder-v1.3
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU (5-Shot)
+      type: cais/mmlu
+      config: all
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 25.41
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Josephgflowers/TinyLlama-3T-Cinder-v1.3
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: TruthfulQA (0-shot)
+      type: truthful_qa
+      config: multiple_choice
+      split: validation
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: mc2
+      value: 38.13
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Josephgflowers/TinyLlama-3T-Cinder-v1.3
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Winogrande (5-shot)
+      type: winogrande
+      config: winogrande_xl
+      split: validation
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 63.93
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Josephgflowers/TinyLlama-3T-Cinder-v1.3
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GSM8k (5-shot)
+      type: gsm8k
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 3.79
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Josephgflowers/TinyLlama-3T-Cinder-v1.3
+      name: Open LLM Leaderboard
 ---
 Overview Cinder is an AI chatbot tailored for engaging users in scientific and educational conversations, offering companionship, and sparking imaginative exploration.
 It is built on the TinyLlama 1.1B parameter model and trained on a unique combination of datasets.
 
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6328952f798f8d122ce62a44/Jv2SVm0sWMjrAUIESoB3K.png)
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Josephgflowers__TinyLlama-3T-Cinder-v1.3)
+
+|             Metric              |Value|
+|---------------------------------|----:|
+|Avg.                             |37.23|
+|AI2 Reasoning Challenge (25-Shot)|33.96|
+|HellaSwag (10-Shot)              |58.14|
+|MMLU (5-Shot)                    |25.41|
+|TruthfulQA (0-shot)              |38.13|
+|Winogrande (5-shot)              |63.93|
+|GSM8k (5-shot)                   | 3.79|
+
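
The `Avg.` row in the results table appears to be the unweighted mean of the six benchmark scores. The PR does not state the averaging rule, so treat that as an assumption. A minimal Python sketch that checks the arithmetic, with the scores copied from the table and a hypothetical helper name `leaderboard_average`:

```python
# Scores copied from the results table added by this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 33.96,
    "HellaSwag (10-Shot)": 58.14,
    "MMLU (5-Shot)": 25.41,
    "TruthfulQA (0-shot)": 38.13,
    "Winogrande (5-shot)": 63.93,
    "GSM8k (5-shot)": 3.79,
}


def leaderboard_average(benchmark_scores: dict[str, float]) -> float:
    """Return the unweighted mean of the scores.

    Assumption: every benchmark carries equal weight; the PR itself
    does not say how "Avg." is derived.
    """
    values = list(benchmark_scores.values())
    return sum(values) / len(values)


if __name__ == "__main__":
    # 223.36 / 6 = 37.2266..., which rounds to the 37.23 shown in the table.
    print(f"{leaderboard_average(scores):.2f}")
```

Under that equal-weight assumption the six scores sum to 223.36, and dividing by 6 reproduces the reported 37.23 after rounding to two decimals.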