---
language:
- en
license: apache-2.0
tags:
- text-generation
- large-language-model
- orpo
datasets:
- jondurbin/truthy-dpo-v0.1
- AlekseyKorshuk/evol-codealpaca-v1-dpo
- argilla/distilabel-intel-orca-dpo-pairs
- argilla/ultrafeedback-binarized-avg-rating-for-dpo-filtered
- snorkelai/Snorkel-Mistral-PairRM-DPO-Dataset
- mlabonne/orpo-dpo-mix-40k
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
model-index:
- name: Coven Tiny 1.1B
  description: >-
    Coven Tiny 1.1B is a derivative of TinyLlama 1.1B Chat, fine-tuned to
    perform specialized tasks involving deeper understanding and reasoning
    over context. This model exhibits strong capabilities in both general
    language understanding and task-specific challenges.
  results:
  - task:
      type: text-generation
      name: Winogrande Challenge
    dataset:
      name: Winogrande
      type: winogrande
      config: winogrande_xl
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value: 61.17
      name: accuracy
  - task:
      type: text-generation
      name: TruthfulQA Generation
    dataset:
      name: TruthfulQA
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: accuracy
      value: 34.31
      name: accuracy
  - task:
      type: text-generation
      name: PIQA Problem Solving
    dataset:
      name: PIQA
      type: piqa
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value: 71.06
      name: accuracy
  - task:
      type: text-generation
      name: OpenBookQA Facts
    dataset:
      name: OpenBookQA
      type: openbookqa
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value: 30.6
      name: accuracy
  - task:
      type: text-generation
      name: MMLU Knowledge Test
    dataset:
      name: MMLU
      type: mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value: 38.03
      name: accuracy
  - task:
      type: text-generation
      name: Hellaswag Contextual Completions
    dataset:
      name: Hellaswag
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: accuracy
      value: 43.44
      name: accuracy
  - task:
      type: text-generation
      name: GSM8k Mathematical Reasoning
    dataset:
      name: GSM8k
      type: gsm8k
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: exact_match
      value: 14.71
      name: exact match (strict)
    - type: exact_match
      value: 14.63
      name: exact match (flexible)
  - task:
      type: text-generation
      name: BoolQ Question Answering
    dataset:
      name: BoolQ
      type: boolq
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value: 65.2
      name: accuracy
  - task:
      type: text-generation
      name: ARC Challenge
    dataset:
      name: ARC Challenge
      type: ai2_arc
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: accuracy
      value: 34.81
      name: accuracy
---
# 🤗 Coven Tiny 1.1B 32K ORPO
Coven Tiny 1.1B 32K is an improved iteration of TinyLlama-1.1B-Chat-v1.0, refined to expand its processing capabilities and better align its output preferences. The context window has been extended to 32K tokens, allowing the model to process longer inputs and handle more complex language scenarios. In addition, Coven Tiny 1.1B 32K was trained with ORPO (Monolithic Preference Optimization without Reference Model). ORPO simplifies fine-tuning by directly optimizing the odds ratio between favored and disfavored generations, improving model behavior without a separate preference-alignment step or a reference model.
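To make that objective concrete, below is a minimal sketch of the ORPO odds-ratio term. It is illustrative only, not the actual training script: the function and argument names are hypothetical, and it assumes length-averaged per-sequence log-probabilities for the chosen and rejected responses, as in the ORPO paper.

```python
import torch
import torch.nn.functional as F

def orpo_loss(chosen_logps, rejected_logps, sft_loss, lam=0.1):
    # chosen_logps / rejected_logps: length-averaged log-probabilities of the
    # preferred and rejected responses under the model, so exp(.) lies in (0, 1).
    # Log-odds ratio: log[p_c / (1 - p_c)] - log[p_r / (1 - p_r)]
    log_odds_ratio = (chosen_logps - rejected_logps) - (
        torch.log1p(-torch.exp(chosen_logps)) - torch.log1p(-torch.exp(rejected_logps))
    )
    # Odds-ratio penalty: push the odds of the chosen response above the rejected one
    or_term = -F.logsigmoid(log_odds_ratio).mean()
    # Single monolithic objective: plain SFT loss on the chosen response plus the
    # weighted odds-ratio term, with no frozen reference model involved
    return sft_loss + lam * or_term
```

The weighting coefficient (here `lam`) controls how strongly the preference signal is mixed into the standard supervised loss.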
## Model Details
- Model name: Coven Tiny 1.1B 32K ORPO alpha
- Fine-tuned by: raidhon
- Base model: TinyLlama-1.1B-Chat-v1.0
- Parameters: 1.1B
- Context: 32K
- Language(s): Multilingual
- License: Apache 2.0
## Eval
| Task | Model | Metric | Value | Change (%) |
|---|---|---|---|---|
| Winogrande | TinyLlama 1.1B Chat | Accuracy | 61.56% | - |
| | Coven Tiny 1.1B | Accuracy | 61.17% | -0.63% |
| TruthfulQA | TinyLlama 1.1B Chat | Accuracy | 30.43% | - |
| | Coven Tiny 1.1B | Accuracy | 34.31% | +12.75% |
| PIQA | TinyLlama 1.1B Chat | Accuracy | 74.10% | - |
| | Coven Tiny 1.1B | Accuracy | 71.06% | -4.10% |
| OpenBookQA | TinyLlama 1.1B Chat | Accuracy | 27.40% | - |
| | Coven Tiny 1.1B | Accuracy | 30.60% | +11.68% |
| MMLU | TinyLlama 1.1B Chat | Accuracy | 24.31% | - |
| | Coven Tiny 1.1B | Accuracy | 38.03% | +56.44% |
| Hellaswag | TinyLlama 1.1B Chat | Accuracy | 45.69% | - |
| | Coven Tiny 1.1B | Accuracy | 43.44% | -4.92% |
| GSM8K (Strict) | TinyLlama 1.1B Chat | Exact Match | 1.82% | - |
| | Coven Tiny 1.1B | Exact Match | 14.71% | +708.24% |
| GSM8K (Flexible) | TinyLlama 1.1B Chat | Exact Match | 2.65% | - |
| | Coven Tiny 1.1B | Exact Match | 14.63% | +452.08% |
| BoolQ | TinyLlama 1.1B Chat | Accuracy | 58.69% | - |
| | Coven Tiny 1.1B | Accuracy | 65.20% | +11.09% |
| ARC Easy | TinyLlama 1.1B Chat | Accuracy | 66.54% | - |
| | Coven Tiny 1.1B | Accuracy | 57.24% | -13.98% |
| ARC Challenge | TinyLlama 1.1B Chat | Accuracy | 34.13% | - |
| | Coven Tiny 1.1B | Accuracy | 34.81% | +1.99% |
| HumanEval | TinyLlama 1.1B Chat | Pass@1 | 10.98% | - |
| | Coven Tiny 1.1B | Pass@1 | 10.37% | -5.56% |
| DROP | TinyLlama 1.1B Chat | Score | 16.02% | - |
| | Coven Tiny 1.1B | Score | 16.36% | +2.12% |
| BBH | Coven Tiny 1.1B | Average | 29.02% | - |
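The metric labels above ("exact match (strict)"/"(flexible)", per-task few-shot counts) follow the conventions of EleutherAI's lm-evaluation-harness. Assuming that harness (or an equivalent setup) was used, a single-task score could be reproduced roughly as sketched below; the exact evaluation configuration is not documented here, so treat task names and settings as examples taken from the table.

```python
import lm_eval

# Illustrative only: assumes lm-evaluation-harness v0.4+ was used.
# Pick tasks and num_fewshot to match the table above (run tasks separately
# when their few-shot counts differ).
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=raidhon/coven_tiny_1.1b_32k_orpo_alpha,dtype=bfloat16",
    tasks=["winogrande"],  # e.g. also: mmlu, gsm8k, boolq, arc_challenge, ...
    num_fewshot=5,         # 5-shot for Winogrande in the table above
)
print(results["results"])
```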
## 💻 Usage
```python
# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="raidhon/coven_tiny_1.1b_32k_orpo_alpha", torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]

# Format the conversation with the model's chat template before generating
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=2048, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```