---
base_model: appvoid/no-prompt-1.3b
datasets:
- appvoid/no-prompt-15k
inference: false
language:
- en
license: apache-2.0
model_creator: appvoid
model_name: no-prompt-1.3b
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
---
# appvoid/no-prompt-1.3b-GGUF
Quantized GGUF model files for no-prompt-1.3b from appvoid
| Name | Quant method | Size |
| --- | --- | --- |
| no-prompt-1.3b.fp16.gguf | fp16 | 2.69 GB |
| no-prompt-1.3b.q2_k.gguf | q2_k | 631.52 MB |
| no-prompt-1.3b.q3_k_m.gguf | q3_k_m | 704.72 MB |
| no-prompt-1.3b.q4_k_m.gguf | q4_k_m | 873.27 MB |
| no-prompt-1.3b.q5_k_m.gguf | q5_k_m | 1.00 GB |
| no-prompt-1.3b.q6_k.gguf | q6_k | 1.17 GB |
| no-prompt-1.3b.q8_0.gguf | q8_0 | 1.43 GB |
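As a rough illustration (not part of the original card), any of these files can be loaded locally with the `llama-cpp-python` bindings. The file path, context length, and prompt text below are assumptions for the sketch, not values taken from this repository.

```python
# Sketch: loading a quantized GGUF file with llama-cpp-python (pip install llama-cpp-python).
# Assumes the q4_k_m file has already been downloaded to the working directory.
from llama_cpp import Llama

llm = Llama(
    model_path="./no-prompt-1.3b.q4_k_m.gguf",  # any file from the table above works
    n_ctx=2048,  # assumed context length, not stated in the card
)

# The model is tuned to continue raw text, so no chat or instruction template is applied.
output = llm("The three laws of thermodynamics state that", max_tokens=64)
print(output["choices"][0]["text"])
```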
## Original Model Card:
# no-prompt
a sheared-llama-1.3b fine-tuning
This model uses a 1.3 billion parameter model as its base and is further fine-tuned on the same data as palmer. It performs well and even surpasses the SOTA model on HellaSwag.
### evaluation
| Model | ARC_C | HellaSwag | PIQA | Winogrande |
| --- | --- | --- | --- | --- |
| tinyllama-2t | 0.2807 | 0.5463 | 0.7067 | 0.5683 |
| palmer-001 | 0.2807 | 0.5524 | 0.7106 | 0.5896 |
| sheared-1.3b | 0.2910 | 0.5935 | 0.7339 | 0.5809 |
| no-prompt-1.3b | 0.3157 | 0.6022 | 0.7334 | 0.5864 |
| falcon-rw-1b-instruct-openorca (sota) | 0.3362 | 0.5997 | 0.7394 | 0.6148 |
This model was trained on less than 25% of the dataset, yet it achieves performance competitive with the current SOTA on the Open LLM Leaderboard.
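The card does not state how these scores were produced. The sketch below shows one hedged way similar numbers might be reproduced with EleutherAI's lm-evaluation-harness; the batch size and few-shot settings are assumptions, and the exact metric keys depend on the harness version.

```python
# Sketch: re-running the four benchmarks with lm-evaluation-harness (pip install lm-eval).
# Assumes harness >= 0.4; older versions expose the same call under lm_eval.evaluator.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=appvoid/no-prompt-1.3b",
    tasks=["arc_challenge", "hellaswag", "piqa", "winogrande"],
    batch_size=8,  # assumed; not stated in the card
)

for task, metrics in results["results"].items():
    print(task, metrics)
```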
### training
Training took ~5 P100 GPU hours on 15,000 shuffled GPT-4 samples. no-prompt was fine-tuned with lower learning rates to retain as much general knowledge as possible.
### prompt
no prompt
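To make the "no prompt" point concrete, here is a sketch (not from the original card) of plain-text generation with the transformers library; the input text and generation settings are illustrative assumptions.

```python
# Sketch: using no-prompt-1.3b as a plain text-completion model, with no prompt template.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("appvoid/no-prompt-1.3b")
model = AutoModelForCausalLM.from_pretrained("appvoid/no-prompt-1.3b")

# Raw text in, raw continuation out: no system message or instruction wrapper.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```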
### limitations
Hallucinations are frequent, as with any transformer model of this size.