File size: 6,804 Bytes
1d68918
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
---
library_name: transformers
datasets:
- argilla/distilabel-capybara-dpo-7k-binarized
---
# CapyLake-7B-v2-laser

This model is a finetune of [cognitivecomputations/WestLake-7B-v2-Laser](https://huggingface.co/cognitivecomputations/WestLake-7B-v2-laser) on [argilla/distilabel-capybara-dpo-7k-binarized](https://huggingface.co/datasets/argilla/distilabel-capybara-dpo-7k-binarized)

<div align="center">  

![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6455cc8d679315e4ef16fbec/kx2uwS_kZ-rTAJiusSrAW.webp)

[<img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-dark.png" alt="Built with Distilabel" width="200" height="32"/>](https://github.com/argilla-io/distilabel)

</div>

## Process

+ Realigned the chat template to ChatML 
+ Completed 1 Epoch
+ 5e-05 learning rate
+ Training time was about 2 hours on 1 H100
+ Cost was ~$8

## Code Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "macadeliccc/CapyLake-7B-v2-laser"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

text = "Create an idea for a TV show and write a short pilot script"
inputs = tokenizer(text, return_tensors="pt")

# Adding hyperparameters to the generation call
outputs = model.generate(
    **inputs,
    max_new_tokens=4096,  # Controls the maximum length of the new tokens created
    temperature=0.7,  # Adjust for creativity (lower is less random)
    top_k=50,  # Keeps the top k tokens for sampling
    top_p=0.95,  # Uses nucleus sampling with this cumulative probability
    num_return_sequences=1,  # Number of sequences to generate
    no_repeat_ngram_size=2,  # Prevents repeating n-grams to ensure diversity
    early_stopping=True  # Stops generation when all sequences reach the EOS token
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Other Capy Models
 
SOLAR-10.7B-Capy-v1.0 is also on the way. There could be more depending on performance!

## Evaluations

|                                     Model                                     |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|-------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[CapyLake-7B-v2-laser](https://huggingface.co/macadeliccc/CapyLake-7B-v2-laser)|  44.34|  77.77|     68.47|   47.92|  59.62|

### AGIEval
|             Task             |Version| Metric |Value|   |Stderr|
|------------------------------|------:|--------|----:|---|-----:|
|agieval_aqua_rat              |      0|acc     |28.35|±  |  2.83|
|                              |       |acc_norm|25.98|±  |  2.76|
|agieval_logiqa_en             |      0|acc     |38.86|±  |  1.91|
|                              |       |acc_norm|39.02|±  |  1.91|
|agieval_lsat_ar               |      0|acc     |25.22|±  |  2.87|
|                              |       |acc_norm|24.35|±  |  2.84|
|agieval_lsat_lr               |      0|acc     |50.39|±  |  2.22|
|                              |       |acc_norm|51.57|±  |  2.22|
|agieval_lsat_rc               |      0|acc     |65.06|±  |  2.91|
|                              |       |acc_norm|63.94|±  |  2.93|
|agieval_sat_en                |      0|acc     |78.64|±  |  2.86|
|                              |       |acc_norm|78.64|±  |  2.86|
|agieval_sat_en_without_passage|      0|acc     |40.78|±  |  3.43|
|                              |       |acc_norm|40.78|±  |  3.43|
|agieval_sat_math              |      0|acc     |33.64|±  |  3.19|
|                              |       |acc_norm|30.45|±  |  3.11|

Average: 44.34%

### GPT4All
|    Task     |Version| Metric |Value|   |Stderr|
|-------------|------:|--------|----:|---|-----:|
|arc_challenge|      0|acc     |66.89|±  |  1.38|
|             |       |acc_norm|67.49|±  |  1.37|
|arc_easy     |      0|acc     |86.70|±  |  0.70|
|             |       |acc_norm|81.90|±  |  0.79|
|boolq        |      1|acc     |88.10|±  |  0.57|
|hellaswag    |      0|acc     |71.45|±  |  0.45|
|             |       |acc_norm|87.78|±  |  0.33|
|openbookqa   |      0|acc     |39.80|±  |  2.19|
|             |       |acc_norm|49.80|±  |  2.24|
|piqa         |      0|acc     |82.86|±  |  0.88|
|             |       |acc_norm|84.87|±  |  0.84|
|winogrande   |      0|acc     |84.45|±  |  1.02|

Average: 77.77%

### TruthfulQA
|    Task     |Version|Metric|Value|   |Stderr|
|-------------|------:|------|----:|---|-----:|
|truthfulqa_mc|      1|mc1   |53.98|±  |  1.74|
|             |       |mc2   |68.47|±  |  1.53|

Average: 68.47%

### Bigbench

|                      Task                      |Version|       Metric        |Value|   |Stderr|
|------------------------------------------------|------:|---------------------|----:|---|-----:|
|bigbench_causal_judgement                       |      0|multiple_choice_grade|59.47|±  |  3.57|
|bigbench_date_understanding                     |      0|multiple_choice_grade|64.50|±  |  2.49|
|bigbench_disambiguation_qa                      |      0|multiple_choice_grade|44.96|±  |  3.10|
|bigbench_geometric_shapes                       |      0|multiple_choice_grade|22.84|±  |  2.22|
|                                                |       |exact_str_match      | 2.79|±  |  0.87|
|bigbench_logical_deduction_five_objects         |      0|multiple_choice_grade|30.80|±  |  2.07|
|bigbench_logical_deduction_seven_objects        |      0|multiple_choice_grade|21.57|±  |  1.56|
|bigbench_logical_deduction_three_objects        |      0|multiple_choice_grade|56.67|±  |  2.87|
|bigbench_movie_recommendation                   |      0|multiple_choice_grade|51.60|±  |  2.24|
|bigbench_navigate                               |      0|multiple_choice_grade|51.00|±  |  1.58|
|bigbench_reasoning_about_colored_objects        |      0|multiple_choice_grade|70.35|±  |  1.02|
|bigbench_ruin_names                             |      0|multiple_choice_grade|51.79|±  |  2.36|
|bigbench_salient_translation_error_detection    |      0|multiple_choice_grade|35.97|±  |  1.52|
|bigbench_snarks                                 |      0|multiple_choice_grade|79.01|±  |  3.04|
|bigbench_sports_understanding                   |      0|multiple_choice_grade|75.66|±  |  1.37|
|bigbench_temporal_sequences                     |      0|multiple_choice_grade|47.90|±  |  1.58|
|bigbench_tracking_shuffled_objects_five_objects |      0|multiple_choice_grade|23.84|±  |  1.21|
|bigbench_tracking_shuffled_objects_seven_objects|      0|multiple_choice_grade|18.00|±  |  0.92|
|bigbench_tracking_shuffled_objects_three_objects|      0|multiple_choice_grade|56.67|±  |  2.87|

Average: 47.92%

Average score: 59.62%

Elapsed time: 01:57:56