---
license: apache-2.0
tags:
- code
---

## Cerebras-GPT-2.7B-Alpaca-SP

Cerebras-GPT-2.7B-Alpaca-SP is [cerebras/Cerebras-GPT-2.7B](https://huggingface.co/cerebras/Cerebras-GPT-2.7B) fine-tuned on a modified [Alpaca dataset](https://github.com/tloen/alpaca-lora/blob/main/alpaca_data_cleaned.json) that uses a shorter prompt structure (`Human:...\n\nAssistant:`). This repository contains the float16 model with the [lxe/lora-cerebras-gpt2.7b-alpaca-shortprompt](https://huggingface.co/lxe/lora-cerebras-gpt2.7b-alpaca-shortprompt) LoRA weights merged in.
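
The short prompt structure can be assembled with a small helper. This is a minimal sketch: the name `build_prompt` is hypothetical and not part of the model repository, and the exact spacing (a single space after `Human:`, no trailing space after `Assistant:`) is an assumption based on the template shown above.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the short Human/Assistant prompt format.

    Hypothetical helper for illustration; not shipped with the model.
    """
    # Strip stray whitespace so the template stays exactly
    # "Human: <instruction>\n\nAssistant:".
    return f"Human: {instruction.strip()}\n\nAssistant:"


print(repr(build_prompt("how old is the sun?")))
# 'Human: how old is the sun?\n\nAssistant:'
```

Ending the prompt at `Assistant:` leaves the model to continue with the reply.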

### Limitations:

Although its output is mostly coherent, the model tends to hallucinate significantly, often producing erroneous information.

### Basic Usage:

```python
import torch
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained('lxe/Cerebras-GPT-2.7B-Alpaca-SP')

# 8-bit loading needs the bitsandbytes and accelerate packages and a CUDA GPU.
model = transformers.AutoModelForCausalLM.from_pretrained(
    'lxe/Cerebras-GPT-2.7B-Alpaca-SP',
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map={"": 0},  # place the whole model on GPU 0
)

# The prompt must follow the short Human/Assistant format used in fine-tuning.
prompt = "Human: how old is the sun?\n\nAssistant:"
input_ids = tokenizer.encode(prompt, return_tensors="pt").cuda()

with torch.no_grad():
    output = model.generate(
        input_ids=input_ids,
        max_length=100,  # total length in tokens, prompt included
        early_stopping=True,
    )

# The decoded text contains the prompt followed by the model's reply.
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

```
Human: how old is the sun?

A: The sun is approximately 4.5 billion years old.
```
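
Because the decoded text includes the prompt, the `Human:` turn is printed along with the answer. A minimal sketch of keeping only the reply follows; `extract_reply` is a hypothetical helper, and cutting at a following `Human:` marker is an assumption about how the model might continue past its answer.

```python
def extract_reply(generated_text: str) -> str:
    """Return the text of the first Assistant turn in a decoded generation.

    Hypothetical helper for illustration; not part of the model repository.
    """
    marker = "Assistant:"
    # partition() yields an empty separator when the marker is absent.
    _, found, reply = generated_text.partition(marker)
    if not found:
        return ""
    # Assumption: if the model starts a new "Human:" turn, drop it.
    reply = reply.split("\n\nHuman:", 1)[0]
    return reply.strip()


text = "Human: how old is the sun?\n\nAssistant: The sun is approximately 4.5 billion years old."
print(extract_reply(text))
# The sun is approximately 4.5 billion years old.
```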

### Example Outputs:

**Human**: how old is the sun? <br />
**Assistant**: The sun is approximately 4.5 billion years old.

**Human**: what are the highest grossing films of the 2000-2010 decade? <br />
**Assistant**: The highest grossing film in 2010 was Avatar, which earned $2.5 billion worldwide and is currently ranked as one of the top 10 movies ever made by Box Office Mojo. Other notable high-grossing films include Titanic ($1.8 billion), Jurassic World ($1.6 billion) and Harry Potter And The Deathly Hallows Part 2 ($1.4 billion).

**Human**: how is cheese made? <br />
**Assistant**: Cheese making involves the process of curdling milk, separating it into a liquid and solid phase, then heating that mixture to create a semi-solid mass. The resulting product can be either hard or soft depending on its composition. Hard cheeses are typically produced from cow's milk while softer varieties come from sheep's milk.

**Human**: print all letters of the alphabet in reverse <br />
**Assistant**: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

**Human**: write a javascript function that sorts an array using quicksort algorithm <br />
**Assistant**:
```
function quicksort(array) {
    if (array.length < 2) {
        return array;
    }
    var pivot = array[0];
    var left = quicksort(array, 0, array.length - 1);
    var right = quicksort(array, array.length - 1, 0);
    if (left < pivot) {
        left = pivot;
    }
    if (right > pivot) {
        right = pivot;
    }
    if (left < right) {
        quicksort(array, left, right);
    }
```

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lxe__Cerebras-GPT-2.7B-Alpaca-SP)

| Metric                | Value                     |
|-----------------------|---------------------------|
| Avg.                  | 29.4                      |
| ARC (25-shot)         | 30.8                      |
| HellaSwag (10-shot)   | 48.88                     |
| MMLU (5-shot)         | 25.12                     |
| TruthfulQA (0-shot)   | 40.24                     |
| Winogrande (5-shot)   | 55.41                     |
| GSM8K (5-shot)        | 0.53                      |
| DROP (3-shot)         | 4.78                      |
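
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores. That reading is inferred from the numbers in the table, not stated by the leaderboard, and can be checked quickly:

```python
# Scores copied from the table above.
scores = {
    "ARC (25-shot)": 30.8,
    "HellaSwag (10-shot)": 48.88,
    "MMLU (5-shot)": 25.12,
    "TruthfulQA (0-shot)": 40.24,
    "Winogrande (5-shot)": 55.41,
    "GSM8K (5-shot)": 0.53,
    "DROP (3-shot)": 4.78,
}

# Assumption: every benchmark carries equal weight in the average.
average = sum(scores.values()) / len(scores)
print(round(average, 1))  # 29.4
```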