psyche/kollama2-7b-v2


Under Construction

This model shows some decoding patterns that originate from the fine-tuning dataset. I will retrain the model to remove these patterns.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 44.91 |
| ARC (25-shot) | 53.33 |
| HellaSwag (10-shot) | 78.5 |
| MMLU (5-shot) | 43.61 |
| TruthfulQA (0-shot) | 46.37 |
| Winogrande (5-shot) | 75.61 |
| GSM8K (5-shot) | 6.52 |
| DROP (3-shot) | 10.4 |
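
As a sanity check on the table above, the short sketch below recomputes the reported average. It assumes that "Avg." is the unweighted mean of the seven benchmark scores; the card does not state this, and the helper name `unweighted_mean` is purely illustrative.

```python
# Scores copied from the results table (benchmark -> value).
scores = {
    "ARC (25-shot)": 53.33,
    "HellaSwag (10-shot)": 78.5,
    "MMLU (5-shot)": 43.61,
    "TruthfulQA (0-shot)": 46.37,
    "Winogrande (5-shot)": 75.61,
    "GSM8K (5-shot)": 6.52,
    "DROP (3-shot)": 10.4,
}


def unweighted_mean(values):
    """Return the arithmetic mean of an iterable of numbers.

    Hypothetical helper; assumes every benchmark carries equal weight.
    """
    values = list(values)
    return sum(values) / len(values)


average = unweighted_mean(scores.values())
print(f"{average:.2f}")  # prints 44.91
```

Under that assumption the recomputed value matches the reported 44.91. The low GSM8K and DROP scores pull the mean well below the other five benchmarks.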