macadeliccc committed on
Commit
ac685a1
1 Parent(s): 1636c24

Update README.md

Files changed (1)
  1. README.md +1 -80
README.md CHANGED
@@ -67,86 +67,7 @@ print(generate_response(prompt), "\n")
  | | |none | 0|acc_norm|0.8058|± |0.0092|
  |winogrande |Yaml |none | 0|acc |0.7372|± |0.0124|
 
- | Model |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
- |---------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
- |[SOLAR-math-2x10.7b](https://huggingface.co/macadeliccc/SOLAR-math-2x10.7b)| 47.2| 75.18| 64.73| 45.15| 58.07|
-
- ### AGIEval
- | Task |Version| Metric |Value| |Stderr|
- |------------------------------|------:|--------|----:|---|-----:|
- |agieval_aqua_rat | 0|acc |30.31|± | 2.89|
- | | |acc_norm|30.31|± | 2.89|
- |agieval_logiqa_en | 0|acc |43.78|± | 1.95|
- | | |acc_norm|43.93|± | 1.95|
- |agieval_lsat_ar | 0|acc |21.74|± | 2.73|
- | | |acc_norm|19.13|± | 2.60|
- |agieval_lsat_lr | 0|acc |57.25|± | 2.19|
- | | |acc_norm|56.47|± | 2.20|
- |agieval_lsat_rc | 0|acc |68.77|± | 2.83|
- | | |acc_norm|68.03|± | 2.85|
- |agieval_sat_en | 0|acc |78.16|± | 2.89|
- | | |acc_norm|79.13|± | 2.84|
- |agieval_sat_en_without_passage| 0|acc |47.57|± | 3.49|
- | | |acc_norm|44.66|± | 3.47|
- |agieval_sat_math | 0|acc |41.36|± | 3.33|
- | | |acc_norm|35.91|± | 3.24|
-
- Average: 47.2%
-
- ### GPT4All
- | Task |Version| Metric |Value| |Stderr|
- |-------------|------:|--------|----:|---|-----:|
- |arc_challenge| 0|acc |59.22|± | 1.44|
- | | |acc_norm|61.43|± | 1.42|
- |arc_easy | 0|acc |84.26|± | 0.75|
- | | |acc_norm|83.63|± | 0.76|
- |boolq | 1|acc |88.69|± | 0.55|
- |hellaswag | 0|acc |65.98|± | 0.47|
- | | |acc_norm|84.29|± | 0.36|
- |openbookqa | 0|acc |34.20|± | 2.12|
- | | |acc_norm|47.20|± | 2.23|
- |piqa | 0|acc |81.83|± | 0.90|
- | | |acc_norm|82.59|± | 0.88|
- |winogrande | 0|acc |78.45|± | 1.16|
-
- Average: 75.18%
-
- ### TruthfulQA
- | Task |Version|Metric|Value| |Stderr|
- |-------------|------:|------|----:|---|-----:|
- |truthfulqa_mc| 1|mc1 |48.47|± | 1.75|
- | | |mc2 |64.73|± | 1.53|
-
- Average: 64.73%
-
- ### Bigbench
- | Task |Version| Metric |Value| |Stderr|
- |------------------------------------------------|------:|---------------------|----:|---|-----:|
- |bigbench_causal_judgement | 0|multiple_choice_grade|61.05|± | 3.55|
- |bigbench_date_understanding | 0|multiple_choice_grade|68.56|± | 2.42|
- |bigbench_disambiguation_qa | 0|multiple_choice_grade|35.27|± | 2.98|
- |bigbench_geometric_shapes | 0|multiple_choice_grade|31.20|± | 2.45|
- | | |exact_str_match | 0.00|± | 0.00|
- |bigbench_logical_deduction_five_objects | 0|multiple_choice_grade|30.00|± | 2.05|
- |bigbench_logical_deduction_seven_objects | 0|multiple_choice_grade|23.43|± | 1.60|
- |bigbench_logical_deduction_three_objects | 0|multiple_choice_grade|46.00|± | 2.88|
- |bigbench_movie_recommendation | 0|multiple_choice_grade|35.60|± | 2.14|
- |bigbench_navigate | 0|multiple_choice_grade|57.50|± | 1.56|
- |bigbench_reasoning_about_colored_objects | 0|multiple_choice_grade|55.80|± | 1.11|
- |bigbench_ruin_names | 0|multiple_choice_grade|45.98|± | 2.36|
- |bigbench_salient_translation_error_detection | 0|multiple_choice_grade|40.58|± | 1.56|
- |bigbench_snarks | 0|multiple_choice_grade|66.85|± | 3.51|
- |bigbench_sports_understanding | 0|multiple_choice_grade|71.40|± | 1.44|
- |bigbench_temporal_sequences | 0|multiple_choice_grade|56.40|± | 1.57|
- |bigbench_tracking_shuffled_objects_five_objects | 0|multiple_choice_grade|24.00|± | 1.21|
- |bigbench_tracking_shuffled_objects_seven_objects| 0|multiple_choice_grade|17.09|± | 0.90|
- |bigbench_tracking_shuffled_objects_three_objects| 0|multiple_choice_grade|46.00|± | 2.88|
-
- Average: 45.15%
-
- Average score: 58.07%
-
- Elapsed time: 04:05:27
 
  ### 📚 Citations
 
 
  | | |none | 0|acc_norm|0.8058|± |0.0092|
  |winogrande |Yaml |none | 0|acc |0.7372|± |0.0124|
 
+
 
  ### 📚 Citations
 
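
The per-suite "Average" lines removed in this commit can be reproduced from the table rows. The sketch below is a minimal check, not the author's evaluation script. It assumes a selection rule inferred from the numbers rather than stated in the README: `acc_norm` where a task reports it and `acc` otherwise for AGIEval and GPT4All, `mc2` for TruthfulQA, and `multiple_choice_grade` for Bigbench. The names `SUITES` and `suite_averages` are hypothetical. Under that rule the four suite averages match the removed summary row, and their mean is 58.065, which the README reports as 58.07.

```python
from statistics import mean

# Per-task scores copied from the removed tables.
# Assumption (inferred, not stated in the README): use acc_norm where a
# task reports it, otherwise acc; mc2 for TruthfulQA; and
# multiple_choice_grade for Bigbench.
SUITES = {
    "AGIEval": [30.31, 43.93, 19.13, 56.47, 68.03, 79.13, 44.66, 35.91],
    "GPT4All": [61.43, 83.63, 88.69, 84.29, 47.20, 82.59, 78.45],
    "TruthfulQA": [64.73],
    "Bigbench": [
        61.05, 68.56, 35.27, 31.20, 30.00, 23.43, 46.00, 35.60, 57.50,
        55.80, 45.98, 40.58, 66.85, 71.40, 56.40, 24.00, 17.09, 46.00,
    ],
}


def suite_averages(suites):
    """Return the mean score of each suite, rounded to two decimals."""
    return {name: round(mean(scores), 2) for name, scores in suites.items()}


averages = suite_averages(SUITES)
# The overall score is the mean of the rounded per-suite averages.
overall = mean(list(averages.values()))

for name, value in averages.items():
    print(f"{name}: {value}")
print(f"Overall (unrounded): {overall}")
```

Averaging the rounded suite values, rather than the unrounded ones, is what lands on the reported 58.07; using unrounded suite means gives a value slightly below 58.065.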