Update src/display/about.py
src/display/about.py CHANGED (+3 -3)
@@ -45,12 +45,12 @@ I use LM-Evaluation-Harness-Turkish, a version of the LM Evaluation Harness adap
 
 ## How to Reproduce Results:
 
-
-
+1) Set up the repo: Clone "lm-evaluation-harness_turkish" from https://github.com/malhajar17/lm-evaluation-harness_turkish and follow the installation instructions.
+2) Run evaluations: To reproduce the leaderboard results (some tests may show small variations), use the following command, adjusting it for your model. For example, with the Trendyol model:
 ```python
 lm_eval --model vllm --model_args pretrained=Trendyol/Trendyol-LLM-7b-chat-v1.0 --tasks truthfulqa_mc2_tr,truthfulqa_mc1_tr,mmlu_tr,winogrande_tr,gsm8k_tr,arc_challenge_tr,hellaswag_tr --output /workspace/Trendyol-LLM-7b-chat-v1.0
 ```
-
+3) Report results: I take the average of the truthfulqa_mc1_tr and truthfulqa_mc2_tr scores and report it as truthfulqa. The generated results file is then uploaded to the OpenLLM Turkish Leaderboard.
 
 ## Notes:
 
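For step 3 above, the sketch below shows one way the two TruthfulQA scores could be pulled from the lm_eval output and averaged before reporting. It is an illustration only: the results file path and the "acc,none" metric keys are assumptions about the harness's JSON layout and may differ across harness versions and output settings.

```python
# Illustrative sketch only: average truthfulqa_mc1_tr and truthfulqa_mc2_tr from an
# lm_eval results file. The path and the "acc,none" metric key are assumptions and
# may not match every version of the harness.
import json

RESULTS_PATH = "/workspace/Trendyol-LLM-7b-chat-v1.0/results.json"  # assumed output location

with open(RESULTS_PATH) as f:
    results = json.load(f)["results"]

mc1 = results["truthfulqa_mc1_tr"]["acc,none"]
mc2 = results["truthfulqa_mc2_tr"]["acc,none"]

# Report the mean of the two tasks as the single truthfulqa score.
print(f"truthfulqa: {(mc1 + mc2) / 2:.4f}")
```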