Aura_v2_7B / README.md
jeiku's picture
Adding Evaluation Results (#1)
2c9d0e1 verified
|
raw
history blame
4.65 kB
---
language:
- en
license: apache-2.0
library_name: transformers
base_model:
- ResplendentAI/Paradigm_7B
- jeiku/Theory_of_Mind_Mistral
- ResplendentAI/Paradigm_7B
- jeiku/selfbot_256_mistral
- ResplendentAI/Paradigm_7B
- jeiku/Gnosis_Reformatted_Mistral
- ResplendentAI/Paradigm_7B
model-index:
- name: Aura_v2_7B
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: AI2 Reasoning Challenge (25-Shot)
type: ai2_arc
config: ARC-Challenge
split: test
args:
num_few_shot: 25
metrics:
- type: acc_norm
value: 73.46
name: normalized accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ResplendentAI/Aura_v2_7B
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: HellaSwag (10-Shot)
type: hellaswag
split: validation
args:
num_few_shot: 10
metrics:
- type: acc_norm
value: 88.64
name: normalized accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ResplendentAI/Aura_v2_7B
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU (5-Shot)
type: cais/mmlu
config: all
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 63.97
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ResplendentAI/Aura_v2_7B
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: TruthfulQA (0-shot)
type: truthful_qa
config: multiple_choice
split: validation
args:
num_few_shot: 0
metrics:
- type: mc2
value: 75.17
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ResplendentAI/Aura_v2_7B
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: Winogrande (5-shot)
type: winogrande
config: winogrande_xl
split: validation
args:
num_few_shot: 5
metrics:
- type: acc
value: 84.45
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ResplendentAI/Aura_v2_7B
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GSM8k (5-shot)
type: gsm8k
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 66.49
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ResplendentAI/Aura_v2_7B
name: Open LLM Leaderboard
---
# Aura v2
![image/png](https://cdn-uploads.huggingface.co/production/uploads/626dfb8786671a29c715f8a9/tIy1fnUYHc7v_N6ym6Z7g.png)
The second version of the Aura line is a direct improvement over the original. Expect poetic and eloquent outputs with real emotion behind them.
I recommend keeping the temperature around 1.5 or lower with a Min P value of 0.05. This model can get carried away with prose at higher temperature. I will say though that the prose of this model is distinct from the GPT 3.5/4 variant, and lends an air of humanity to the outputs. I am aware that this model is overfit, but that was the point of the entire exercise.
If you have trouble getting the model to follow an asterisks/quote format, I recommend asterisks/plaintext instead. This model skews toward shorter outputs, so be prepared to lengthen your introduction and examples if you want longer outputs.
This model responds best to ChatML for multiturn conversations.
This model, like all other Mistral based models, is compatible with a Mistral compatible mmproj file for multimodal vision capabilities in KoboldCPP.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ResplendentAI__Aura_v2_7B)
| Metric |Value|
|---------------------------------|----:|
|Avg. |75.36|
|AI2 Reasoning Challenge (25-Shot)|73.46|
|HellaSwag (10-Shot) |88.64|
|MMLU (5-Shot) |63.97|
|TruthfulQA (0-shot) |75.17|
|Winogrande (5-shot) |84.45|
|GSM8k (5-shot) |66.49|