Commit dd387dd by llmixer (1 parent: b674da2)

Create README.md

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ base_model:
+ - 152334H/miqu-1-70b-sf
+ license: unknown
+ language:
+ - en
+ pipeline_tag: text-generation
+ tags:
+ - merge
+ - frankenmerge
+ - 95b
+ ---
+ # BigWeave v26 95b
+
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/65a6db055c58475cf9e6def1/4CbbAN-X7ZWj702JrcCGH.png" width=600>
+
+ The BigWeave models aim to experimentally identify merge settings that increase model performance. The version number only tracks successive attempts and is not a quality indicator. Only results that demonstrate good performance are retained and shared.
+
+ # Prompting Format
+ ChatML, Mistral, Vicuna.
+
+ # Merge process
+ This is a self-merge of 152334H/miqu-1-70b-sf. The last 30 layers are duplicated using overlapping slices of 10 layers each. According to exl2 measurements, these are among the most important layers.
+
+ Merge configuration:
+ ```
+ slices:
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [0,54]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [49,59]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [54,64]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [59,69]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [64,74]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [69,79]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [74,80]
+ merge_method: passthrough
+ dtype: float16
+
+ ```
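
Note on the merge configuration: the following is a minimal Python sketch, not part of the commit, for checking how the slices stack up. It assumes each `layer_range` is half-open (`[start, end)`), which is how the final `[74,80]` range reads for an 80-layer base model. The names `SLICES`, `expand_slices`, and `duplicated_layers` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter

# Slice boundaries copied from the merge configuration in the README.
# Assumption: each layer_range is half-open, i.e. [start, end).
SLICES = [(0, 54), (49, 59), (54, 64), (59, 69), (64, 74), (69, 79), (74, 80)]


def expand_slices(slices):
    """Return the source-layer index used at each position of the merged stack."""
    return [layer for start, end in slices for layer in range(start, end)]


def duplicated_layers(slices):
    """Return the sorted source layers that appear more than once in the stack."""
    counts = Counter(expand_slices(slices))
    return sorted(layer for layer, n in counts.items() if n > 1)


merged = expand_slices(SLICES)
dupes = duplicated_layers(SLICES)

print("merged layer count:", len(merged))
print("duplicated source layers:", dupes[0], "to", dupes[-1], f"({len(dupes)} layers)")
```

Under that assumption, the merged stack has 110 layers, and source layers 49 through 78 each appear twice, which matches the README's description of 30 duplicated layers near the end of the model.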