---
license: apache-2.0
tags:
- code
---
## Cerebras-GPT-2.7B-Alpaca-SP
Cerebras-GPT-2.7B-Alpaca-SP is [cerebras/Cerebras-GPT-2.7B](https://huggingface.co/cerebras/Cerebras-GPT-2.7B) fine-tuned on a modified [Alpaca dataset](https://github.com/tloen/alpaca-lora/blob/main/alpaca_data_cleaned.json) with a shorter prompt structure (`Human:...\n\nAssistant:`). This repository contains the float16 base model with the [lxe/lora-cerebras-gpt2.7b-alpaca-shortprompt](https://huggingface.co/lxe/lora-cerebras-gpt2.7b-alpaca-shortprompt) LoRA weights merged in.
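The short prompt structure can be assembled with a small helper. This is a minimal sketch, assuming a single-turn exchange like the one shown under Basic Usage; `build_prompt` is a hypothetical name used for illustration and is not part of this repository.

```python
def build_prompt(instruction: str) -> str:
    """Wrap one user instruction in the short Human/Assistant prompt format."""
    # Assumption: single turn only; surrounding whitespace is stripped
    # from the instruction before it is inserted.
    return f"Human: {instruction.strip()}\n\nAssistant:"


print(repr(build_prompt("how old is the sun?")))
# 'Human: how old is the sun?\n\nAssistant:'
```

The trailing `Assistant:` is left open so that the model's continuation becomes the reply.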
### Limitations:
Although its output is mostly coherent, the model tends to hallucinate significantly and often produces erroneous information.
### Basic Usage:
```python
import torch
import transformers

# Load the tokenizer and the merged model from the Hugging Face Hub.
tokenizer = transformers.AutoTokenizer.from_pretrained('lxe/Cerebras-GPT-2.7B-Alpaca-SP')
model = transformers.AutoModelForCausalLM.from_pretrained(
    'lxe/Cerebras-GPT-2.7B-Alpaca-SP',
    load_in_8bit=True,  # 8-bit loading requires the bitsandbytes package
    torch_dtype=torch.float16,
    device_map={"": 0},  # place the whole model on GPU 0
)

# The model expects the short "Human: ...\n\nAssistant:" prompt format.
prompt = "Human: how old is the sun?\n\nAssistant:"
input_ids = tokenizer.encode(prompt, return_tensors="pt").cuda()

with torch.no_grad():
    output = model.generate(
        input_ids=input_ids,
        max_length=100,
        early_stopping=True,
    )

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
```
Human: how old is the sun?
Assistant: The sun is approximately 4.5 billion years old.
```
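Because the decoded text includes the prompt itself, it is often convenient to keep only the assistant's reply. A minimal sketch, assuming the reply is whatever follows the first `Assistant:` marker; `extract_reply` is a hypothetical helper and not part of `transformers` or this repository.

```python
def extract_reply(generated_text: str, marker: str = "Assistant:") -> str:
    """Return the text after the first occurrence of `marker`, trimmed."""
    # If the marker is absent, partition() returns the whole string first,
    # followed by an empty separator and an empty tail.
    _, found, tail = generated_text.partition(marker)
    return tail.strip() if found else generated_text.strip()


sample = (
    "Human: how old is the sun?\n\n"
    "Assistant: The sun is approximately 4.5 billion years old."
)
print(extract_reply(sample))
# The sun is approximately 4.5 billion years old.
```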
### Example Outputs:
**Human**: how old is the sun? <br />
**Assistant**: The sun is approximately 4.5 billion years old.
**Human**: what are the highest grossing films of the 2000-2010 decade? <br />
**Assistant**: The highest grossing film in 2010 was Avatar, which earned $2.5 billion worldwide and is currently ranked as one of the top 10 movies ever made by Box Office Mojo. Other notable high-grossing films include Titanic ($1.8 billion), Jurassic World ($1.6 billion) and Harry Potter And The Deathly Hallows Part 2 ($1.4 billion).
**Human**: how is cheese made? <br />
**Assistant**: Cheese making involves the process of curdling milk, separating it into a liquid and solid phase, then heating that mixture to create a semi-solid mass. The resulting product can be either hard or soft depending on its composition. Hard cheeses are typically produced from cow's milk while softer varieties come from sheep's milk.
**Human**: print all letters of the alphabet in reverse <br />
**Assistant**: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
**Human**: write a javascript function that sorts an array using quicksort algorithm <br />
**Assistant**:
```
function quicksort(array) {
if (array.length < 2) {
return array;
}
var pivot = array[0];
var left = quicksort(array, 0, array.length - 1);
var right = quicksort(array, array.length - 1, 0);
if (left < pivot) {
left = pivot;
}
if (right > pivot) {
right = pivot;
}
if (left < right) {
quicksort(array, left, right);
}
```
## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lxe__Cerebras-GPT-2.7B-Alpaca-SP).
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 29.4 |
| ARC (25-shot) | 30.8 |
| HellaSwag (10-shot) | 48.88 |
| MMLU (5-shot) | 25.12 |
| TruthfulQA (0-shot) | 40.24 |
| Winogrande (5-shot) | 55.41 |
| GSM8K (5-shot) | 0.53 |
| DROP (3-shot) | 4.78 |
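
The `Avg.` row is consistent with an unweighted mean of the seven benchmark scores. A quick check, assuming that is how the leaderboard computed it (the table itself does not say):

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC (25-shot)": 30.8,
    "HellaSwag (10-shot)": 48.88,
    "MMLU (5-shot)": 25.12,
    "TruthfulQA (0-shot)": 40.24,
    "Winogrande (5-shot)": 55.41,
    "GSM8K (5-shot)": 0.53,
    "DROP (3-shot)": 4.78,
}

# Assumption: Avg. is the plain arithmetic mean, with no per-task weighting.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 29.39, which rounds to the reported 29.4
```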