ubowang committed on
Commit
86c5f36
1 Parent(s): 8d679bb

Update utils.py

Files changed (1)
  1. utils.py +8 -8
utils.py CHANGED
@@ -33,6 +33,14 @@ Welcome to the MMLU-Pro leaderboard, showcasing the performance of various advan
 
  The MMLU-Pro dataset consists of approximately 12,000 intricate questions that challenge the comprehension and reasoning abilities of LLMs. Below you can find the accuracies of different models tested on this dataset.
 
+ For detailed information about the dataset, visit our page on Hugging Face: https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro. If you are interested in replicating these results or wish to evaluate your models using our dataset, access our evaluation scripts available on GitHub: https://github.com/TIGER-AI-Lab/MMLU-Pro.
+ """
+
+ TABLE_INTRODUCTION = """
+ """
+
+ LEADERBOARD_INFO = """
+
  ## 1. What's new about MMLU-Pro
 
  Compared to the original MMLU, there are three major differences:
@@ -51,14 +59,6 @@ Compared to the original MMLU, there are three major differences:
  - **TheoremQA:** High-quality human-annotated questions requiring theorems to solve.
  - **Scibench:** Science questions from college exams.
 
- For detailed information about the dataset, visit our page on Hugging Face: MMLU-Pro at Hugging Face. If you are interested in replicating these results or wish to evaluate your models using our dataset, access our evaluation scripts available on GitHub: TIGER-AI-Lab/MMLU-Pro.
- """
-
- TABLE_INTRODUCTION = """
- """
-
- LEADERBOARD_INFO = """
- We list the information of the used datasets as follows:<br>
 
  """
 