OpenLLaMA Glaive: An Open Reproduction of LLaMA

This is an OpenLLaMA Code Instruct model that has been fine-tuned for 1 epoch on the Glaive Assistant dataset.

Prompt Template

<s>[INST] {{ user_msg }} [/INST]
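
A minimal sketch of filling in the template from Python. The helper name `build_prompt` is an assumption made for illustration and is not part of the model or of any library; it only substitutes the user message into the template shown above.

```python
def build_prompt(user_msg: str) -> str:
    """Wrap a user message in the [INST] prompt template (hypothetical helper)."""
    return f"<s>[INST] {user_msg} [/INST]"


print(build_prompt("Write a quick sort algorithm in Python"))
```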

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load the tokenizer and the model from the same repository
tokenizer = AutoTokenizer.from_pretrained("mwitiderrick/open_llama_3b_glaive_code_v0.1")
model = AutoModelForCausalLM.from_pretrained("mwitiderrick/open_llama_3b_glaive_code_v0.1")
query = "Write a quick sort algorithm in Python"
text_gen = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
# Wrap the query in the [INST] prompt template before generating
output = text_gen(f"<s>[INST]{query}[/INST]")
print(output[0]['generated_text'])
"""
<s>[INST]Write a quick sort algorithm in Python[/INST]

Quick sort is a divide and conquer algorithm that sorts an array in-place.
It works by repeatedly dividing the array into two sub-arrays, sorting
them, and then merging them back together.

Here's a Python implementation of the quick sort algorithm:

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[len(arr) // 2]
        left = [x for x in arr if x < pivot]
        right = [x for x in arr if x > pivot]
        return quick_sort(left) + [pivot] + quick_sort
"""

Metrics

Detailed metrics

|  Tasks  |Version|Filter|n-shot| Metric |Value |   |Stderr|
|---------|-------|------|-----:|--------|-----:|---|-----:|
|hellaswag|Yaml   |none  |     0|acc     |0.4974|±  |0.0050|
|         |       |none  |     0|acc_norm|0.6600|±  |0.0047|

|  Groups  |Version|Filter|n-shot|  Metric   | Value  |   |Stderr|
|----------|-------|------|-----:|-----------|-------:|---|-----:|
|truthfulqa|N/A    |none  |     0|bleu_max   | 23.5771|±  |0.5407|
|          |       |none  |     0|bleu_acc   |  0.2754|±  |0.0002|
|          |       |none  |     0|bleu_diff  | -8.1019|±  |0.5137|
|          |       |none  |     0|rouge1_max | 49.5707|±  |0.6501|
|          |       |none  |     0|rouge1_acc |  0.2607|±  |0.0002|
|          |       |none  |     0|rouge1_diff| -9.8962|±  |0.5492|
|          |       |none  |     0|rouge2_max | 33.0399|±  |0.8237|
|          |       |none  |     0|rouge2_acc |  0.2313|±  |0.0002|
|          |       |none  |     0|rouge2_diff|-11.9054|±  |0.7963|
|          |       |none  |     0|rougeL_max | 46.3168|±  |0.6705|
|          |       |none  |     0|rougeL_acc |  0.2521|±  |0.0002|
|          |       |none  |     0|rougeL_diff|-10.1301|±  |0.5669|
|          |       |none  |     0|acc        |  0.3191|±  |0.0405|

|  Tasks   |Version|Filter|n-shot|Metric|Value |   |Stderr|
|----------|-------|------|-----:|------|-----:|---|-----:|
|winogrande|Yaml   |none  |     0|acc   |0.6322|±  |0.0136|

|    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
|-------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge|Yaml   |none  |     0|acc     |0.3234|±  |0.0137|
|             |       |none  |     0|acc_norm|0.3447|±  |0.0139|
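
To read the `Stderr` column, it can help to turn each value into an approximate 95% interval. The sketch below assumes a normal approximation (z = 1.96); the helper name `confidence_interval` is hypothetical, and the numbers are copied from the tables above.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple:
    """Approximate interval value ± z * stderr (normal approximation assumed)."""
    margin = z * stderr
    return value - margin, value + margin


# (value, stderr) pairs copied from the tables above
results = {
    "hellaswag acc_norm": (0.6600, 0.0047),
    "winogrande acc": (0.6322, 0.0136),
    "arc_challenge acc_norm": (0.3447, 0.0139),
}

for name, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{name}: {value:.4f} (95% CI {low:.4f} to {high:.4f})")
```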

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

|Metric                           |Value|
|---------------------------------|----:|
|Avg.                             |39.74|
|AI2 Reasoning Challenge (25-Shot)|40.70|
|HellaSwag (10-Shot)              |67.45|
|MMLU (5-Shot)                    |27.74|
|TruthfulQA (0-shot)              |35.86|
|Winogrande (5-shot)              |64.72|
|GSM8k (5-shot)                   | 1.97|
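
The reported average appears to be the unweighted mean of the six benchmark scores. A quick check under that assumption, with the values copied from the results above:

```python
# Scores copied from the leaderboard results above
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 40.70,
    "HellaSwag (10-Shot)": 67.45,
    "MMLU (5-Shot)": 27.74,
    "TruthfulQA (0-shot)": 35.86,
    "Winogrande (5-shot)": 64.72,
    "GSM8k (5-shot)": 1.97,
}

# Unweighted mean across the six benchmarks
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```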

Model size: 3.43B params (Safetensors, tensor type FP16)