TehVenom/Pygmalion-13b-Merged


Adding Evaluation Results #1

by leaderboard-pr-bot, opened Nov 17, 2023

base: refs/heads/main ← from: refs/pr/1

Files changed (1)

README.md +14 -0


@@ -117,3 +117,17 @@ Current evals out of the Pygmalion-13b model: <br>
 The intended use-case for this model is fictional conversation for entertainment purposes. Any other sort of usage is out of scope.
 As such, it was **not** fine-tuned to be safe and harmless: the base model _and_ this fine-tune have been trained on data known to contain profanity and texts that are lewd or otherwise offensive. It may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive. Outputs might often be factually wrong or misleading.

+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TehVenom__Pygmalion-13b-Merged)
+| Metric                | Value                     |
+|-----------------------|---------------------------|
+| Avg.                  | 44.8   |
+| ARC (25-shot)         | 56.48          |
+| HellaSwag (10-shot)   | 80.02    |
+| MMLU (5-shot)         | 42.93         |
+| TruthfulQA (0-shot)   | 35.86   |
+| Winogrande (5-shot)   | 75.53   |
+| GSM8K (5-shot)        | 0.08        |
+| DROP (3-shot)         | 22.67         |
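
As a quick sanity check on the added table, the `Avg.` row appears to be the unweighted mean of the seven benchmark scores. The sketch below recomputes it from the values in the diff. The equal weighting and the helper name `leaderboard_average` are assumptions made for illustration, not something the PR states.

```python
# Benchmark scores copied verbatim from the table added in this PR.
scores = {
    "ARC (25-shot)": 56.48,
    "HellaSwag (10-shot)": 80.02,
    "MMLU (5-shot)": 42.93,
    "TruthfulQA (0-shot)": 35.86,
    "Winogrande (5-shot)": 75.53,
    "GSM8K (5-shot)": 0.08,
    "DROP (3-shot)": 22.67,
}


def leaderboard_average(benchmarks: dict[str, float]) -> float:
    """Unweighted mean of all benchmark scores (assumed averaging rule)."""
    return sum(benchmarks.values()) / len(benchmarks)


avg = leaderboard_average(scores)
print(round(avg, 2))  # 44.8, matching the Avg. row
```

The near-zero GSM8K score pulls the mean down noticeably, so the average on its own understates how the model does on the other benchmarks.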