NovelHack-ja committed
Commit 04c1e91
1 Parent(s): 53a0aa3

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,1050 @@
+ ---
+ base_model:
+ - mistralai/Mistral-7B-v0.3
+ - meta-math/MetaMath-Mistral-7B
+ - uukuguy/speechless-zephyr-code-functionary-7b
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # Yosegi-2
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with ./Yosegi-0601 as the base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [mistralai/Mistral-7B-v0.3](https://huggingface.co/mistralai/Mistral-7B-v0.3)
+ * [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B)
+ * ./Ninja-v1-RP-expressive-v2-LoRA
+ * [uukuguy/speechless-zephyr-code-functionary-7b](https://huggingface.co/uukuguy/speechless-zephyr-code-functionary-7b)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: ./Yosegi-0601
+ dtype: bfloat16
+ merge_method: ties
+ parameters:
+   int8_mask: 1.0
+   normalize: 0.0
+ slices:
+ - sources:
+   - layer_range: [0, 2]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.9895701336232673
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.5057237984975562
+       - filter: mlp
+         value: 0.36247235528151495
+       - value: 0.0076810835717692014
+   - layer_range: [0, 2]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 0.8239779346577963
+       weight:
+       - filter: self_attn
+         value: 0.27499287617186813
+       - filter: mlp
+         value: 0.10579959634086915
+       - value: 0.14502290477239704
+   - layer_range: [0, 2]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.9654867628269999
+       - value: 0.9584724004158125
+       weight:
+       - filter: self_attn
+         value: 0.059719404899177556
+       - filter: mlp
+         value: 0.1299695859327612
+       - value: 0.18821871354400985
+   - layer_range: [0, 2]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9322987005873715
+       - filter: mlp
+         value: 0.8119693860979944
+       - value: 0.7800996941956229
+       weight:
+       - filter: self_attn
+         value: 0.14989333734000856
+       - filter: mlp
+         value: 0.20525182711733667
+       - value: 0.0743540962371737
+   - layer_range: [0, 2]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [2, 4]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.6361163471256639
+       - filter: mlp
+         value: 0.9983948965135213
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.2433049522842103
+       - filter: mlp
+         value: 0.11537153133586801
+       - value: 0.11236945502439658
+   - layer_range: [2, 4]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.13087986863180992
+       - filter: mlp
+         value: 0.05060452788200992
+       - value: 0.029882383396623725
+   - layer_range: [2, 4]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9938109261305853
+       - filter: mlp
+         value: 0.709432587913349
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.15343343058938377
+       - filter: mlp
+         value: 0.4105917936868785
+       - value: 0.6078632204623161
+   - layer_range: [2, 4]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 0.9634269234020544
+       weight:
+       - filter: self_attn
+         value: 0.03750763360681478
+       - filter: mlp
+         value: 0.29089122858987404
+       - value: 0.3408085857388722
+   - layer_range: [2, 4]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [4, 6]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8057109303418598
+       - filter: mlp
+         value: 0.9954520808628292
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.02598285706585618
+       - filter: mlp
+         value: 0.06661629726622949
+       - value: 0.1285191000066376
+   - layer_range: [4, 6]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9112825916608848
+       - filter: mlp
+         value: 0.9322557507910056
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.18823564379986454
+       - filter: mlp
+         value: 0.4552822441636322
+       - value: 0.5120525709221785
+   - layer_range: [4, 6]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9869122169774399
+       - filter: mlp
+         value: 1.0
+       - value: 0.9751291459565757
+       weight:
+       - filter: self_attn
+         value: 0.00493134813843582
+       - filter: mlp
+         value: 0.3008979965262413
+       - value: 0.2528466849993097
+   - layer_range: [4, 6]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8956512783019246
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.4197408619693966
+       - filter: mlp
+         value: 0.1448902874618845
+       - value: 0.5196932662212128
+   - layer_range: [4, 6]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [6, 8]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.05321377226808306
+       - filter: mlp
+         value: 0.0482589904702303
+       - value: 0.433407006546336
+   - layer_range: [6, 8]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8300482882633113
+       - filter: mlp
+         value: 0.8951636861593875
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.35952608658046414
+       - filter: mlp
+         value: 0.17385333183950857
+       - value: 0.6366514725970246
+   - layer_range: [6, 8]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.7848308077099464
+       - filter: mlp
+         value: 0.869549457974157
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.12433943050311849
+       - filter: mlp
+         value: 0.3065832590226165
+       - value: 0.33138948726149514
+   - layer_range: [6, 8]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.11885967308786714
+       - filter: mlp
+         value: 0.29125668567121127
+       - value: 0.19251901269486088
+   - layer_range: [6, 8]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [8, 10]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.9429625513013793
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.4085396076816443
+       - filter: mlp
+         value: 0.038473657720644636
+       - value: 0.35014489493395495
+   - layer_range: [8, 10]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.26957216810533163
+       - filter: mlp
+         value: 0.2393300696241166
+       - value: 0.4735322427351712
+   - layer_range: [8, 10]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8594757954447017
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.26101395702355007
+       - filter: mlp
+         value: 0.3147672140145126
+       - value: 0.11658182776184756
+   - layer_range: [8, 10]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.6948062341711919
+       - value: 0.9312401427737346
+       weight:
+       - filter: self_attn
+         value: 0.1987774487170517
+       - filter: mlp
+         value: 0.5628384475763534
+       - value: 0.2765378221890683
+   - layer_range: [8, 10]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [10, 12]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8230035654228713
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.1741591536775035
+       - filter: mlp
+         value: 0.30563583223301516
+       - value: 0.2060419023239155
+   - layer_range: [10, 12]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.9991063013557119
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.1470996125766866
+       - filter: mlp
+         value: 0.06646481892400827
+       - value: 0.2645489609472036
+   - layer_range: [10, 12]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.6812899560643833
+       - filter: mlp
+         value: 0.9083104648631823
+       - value: 0.9730062683598184
+       weight:
+       - filter: self_attn
+         value: 0.14278507832578724
+       - filter: mlp
+         value: 0.3475945971407978
+       - value: 0.40266546962595284
+   - layer_range: [10, 12]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.7047231879232164
+       - filter: mlp
+         value: 0.9148432633716144
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.15341559366405985
+       - filter: mlp
+         value: 0.20047704006010095
+       - value: 0.17364445581398172
+   - layer_range: [10, 12]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [12, 14]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.6974090973508299
+       - filter: mlp
+         value: 1.0
+       - value: 0.9553573565285324
+       weight:
+       - filter: self_attn
+         value: 0.03614401712451334
+       - filter: mlp
+         value: 0.1287785039219736
+       - value: 0.3780545754310749
+   - layer_range: [12, 14]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.7857328784783159
+       - filter: mlp
+         value: 1.0
+       - value: 0.6631303877423032
+       weight:
+       - filter: self_attn
+         value: 0.21728574423632604
+       - filter: mlp
+         value: 0.22813107248290188
+       - value: 0.1435266378249425
+   - layer_range: [12, 14]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.7579910864422339
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.21526786827735228
+       - filter: mlp
+         value: 0.19769619474642783
+       - value: 0.49420458585638627
+   - layer_range: [12, 14]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8379590665264793
+       - filter: mlp
+         value: 1.0
+       - value: 0.6778673543559375
+       weight:
+       - filter: self_attn
+         value: 0.060679858649663874
+       - filter: mlp
+         value: 0.17248738428562518
+       - value: 0.05145640258269078
+   - layer_range: [12, 14]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [14, 16]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8193296716327286
+       - value: 0.709644132681917
+       weight:
+       - filter: self_attn
+         value: 0.09821428505487592
+       - filter: mlp
+         value: 0.0039875777021436964
+       - value: 0.27550746634944184
+   - layer_range: [14, 16]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9420135087156387
+       - filter: mlp
+         value: 1.0
+       - value: 0.9478569230341948
+       weight:
+       - filter: self_attn
+         value: 0.32640822225239857
+       - filter: mlp
+         value: 0.28189746971019747
+       - value: 0.09777040841174603
+   - layer_range: [14, 16]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9811539353914964
+       - filter: mlp
+         value: 1.0
+       - value: 0.9947034500579488
+       weight:
+       - filter: self_attn
+         value: 0.015308461456516246
+       - filter: mlp
+         value: 0.0018966958379955934
+       - value: 0.24275389952300747
+   - layer_range: [14, 16]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9022355771447704
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.03331841447575224
+       - filter: mlp
+         value: 0.03561712850019841
+       - value: 0.16096143804589919
+   - layer_range: [14, 16]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [16, 18]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8813466618200871
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.20435001101909528
+       - filter: mlp
+         value: 0.1516594727144469
+       - value: 0.2269819409999868
+   - layer_range: [16, 18]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 0.8113796412034742
+       weight:
+       - filter: self_attn
+         value: 0.23760349395229585
+       - filter: mlp
+         value: 0.1725436279774783
+       - value: 0.5818814139457673
+   - layer_range: [16, 18]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.9307369835995082
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.0673898519051937
+       - filter: mlp
+         value: 0.049368399457210624
+       - value: 0.2621269048339309
+   - layer_range: [16, 18]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8219541044757637
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.21320061393511042
+       - filter: mlp
+         value: 0.09188781867337345
+       - value: 0.27266490524762327
+   - layer_range: [16, 18]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [18, 20]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.7993530327131696
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.20420262433348008
+       - filter: mlp
+         value: 0.43400570066910155
+       - value: 0.13720822682656159
+   - layer_range: [18, 20]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 0.7035563346885239
+       weight:
+       - filter: self_attn
+         value: 0.3313523263002212
+       - filter: mlp
+         value: 0.356035051194268
+       - value: 0.4742357680522683
+   - layer_range: [18, 20]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.2475654838180605
+       - filter: mlp
+         value: 0.35095371882044646
+       - value: 0.18536862919946695
+   - layer_range: [18, 20]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.02997204931537696
+       - filter: mlp
+         value: 0.4103581291392323
+       - value: 0.19313933251158066
+   - layer_range: [18, 20]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [20, 22]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 0.5321196166337413
+       weight:
+       - filter: self_attn
+         value: 0.17930537920958298
+       - filter: mlp
+         value: 0.07662274511683252
+       - value: 0.1354315278471591
+   - layer_range: [20, 22]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.3768803907042144
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.1592147705254305
+       - filter: mlp
+         value: 0.18410207999201075
+       - value: 0.4928015910047033
+   - layer_range: [20, 22]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.37897278298418885
+       - filter: mlp
+         value: 0.0952591073533606
+       - value: 0.03551732810121447
+   - layer_range: [20, 22]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.2682334102128691
+       - filter: mlp
+         value: 0.33485781481395227
+       - value: 0.3395139468281392
+   - layer_range: [20, 22]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [22, 24]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8002588203446623
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.2549204541625693
+       - filter: mlp
+         value: 0.3722418477156178
+       - value: 0.2410463731352089
+   - layer_range: [22, 24]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9220873255898425
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.487455295718532
+       - filter: mlp
+         value: 0.40022413917173594
+       - value: 0.17846009757502157
+   - layer_range: [22, 24]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.7696341317318985
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.011267799816515114
+       - filter: mlp
+         value: 0.5320959832591042
+       - value: 0.17095406531325266
+   - layer_range: [22, 24]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.556101646343872
+       - filter: mlp
+         value: 0.5470253909079791
+       - value: 0.13241555469863223
+   - layer_range: [22, 24]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [24, 26]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8667033674916582
+       - filter: mlp
+         value: 1.0
+       - value: 0.9446091486920749
+       weight:
+       - filter: self_attn
+         value: 0.4134110775513897
+       - filter: mlp
+         value: 0.0181822765943834
+       - value: 0.22797659617038232
+   - layer_range: [24, 26]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.9839865829690491
+       - value: 0.8252981103449059
+       weight:
+       - filter: self_attn
+         value: 0.3310295320944009
+       - filter: mlp
+         value: 0.05341478458353629
+       - value: 0.3588847186159219
+   - layer_range: [24, 26]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8834823812212265
+       - value: 0.8195593509048733
+       weight:
+       - filter: self_attn
+         value: 0.3778012590489552
+       - filter: mlp
+         value: 0.2553204906819882
+       - value: 0.23250565137970108
+   - layer_range: [24, 26]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.7731602497153744
+       - value: 0.8647152680789973
+       weight:
+       - filter: self_attn
+         value: 0.1101209698118704
+       - filter: mlp
+         value: 0.2399169741437055
+       - value: 0.32311925187355206
+   - layer_range: [24, 26]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [26, 28]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.9508674341172941
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.4312865186270921
+       - filter: mlp
+         value: 0.28336325917543326
+       - value: 0.051826325177477234
+   - layer_range: [26, 28]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8945725432745376
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.03524133636598346
+       - filter: mlp
+         value: 0.21426126710725438
+       - value: 0.31724116335002545
+   - layer_range: [26, 28]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.7138130384877139
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.04890129864608137
+       - filter: mlp
+         value: 0.3324333287494201
+       - value: 0.11533647335498036
+   - layer_range: [26, 28]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 0.9200281327001997
+       weight:
+       - filter: self_attn
+         value: 0.300842776105564
+       - filter: mlp
+         value: 0.08363140003203932
+       - value: 0.2538677006866867
+   - layer_range: [26, 28]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [28, 30]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.7116000185808022
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.10977758983122704
+       - filter: mlp
+         value: 0.1839207861311269
+       - value: 0.5426174846632369
+   - layer_range: [28, 30]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8412049419861911
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.3517232690814979
+       - filter: mlp
+         value: 0.11878679655495025
+       - value: 0.432611353923264
+   - layer_range: [28, 30]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.7196182744068202
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.29848623969081
+       - filter: mlp
+         value: 0.034661358236493495
+       - value: 0.3438376072572394
+   - layer_range: [28, 30]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.051511204430449285
+       - filter: mlp
+         value: 0.3617968383178797
+       - value: 0.2578690795635758
+   - layer_range: [28, 30]
+     model: ./Yosegi-0601
+ - sources:
+   - layer_range: [30, 32]
+     model: mistralai/Mistral-7B-v0.3
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.7971002248466003
+       - filter: mlp
+         value: 0.8931695149333363
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.07401430804790136
+       - filter: mlp
+         value: 0.00696466997386886
+       - value: 0.08295038526296711
+   - layer_range: [30, 32]
+     model: meta-math/MetaMath-Mistral-7B
+     parameters:
+       density:
+       - filter: self_attn
+         value: 1.0
+       - filter: mlp
+         value: 0.8158777337631619
+       - value: 0.8348784699583887
+       weight:
+       - filter: self_attn
+         value: 0.26799125918248423
+       - filter: mlp
+         value: 0.08176923813129498
+       - value: 0.030317330226146508
+   - layer_range: [30, 32]
+     model: uukuguy/speechless-zephyr-code-functionary-7b
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.8188850632365792
+       - filter: mlp
+         value: 0.7463831519693573
+       - value: 0.6515317051533988
+       weight:
+       - filter: self_attn
+         value: 0.21122007850953434
+       - filter: mlp
+         value: 0.1463362342258229
+       - value: 0.09176704194956312
+   - layer_range: [30, 32]
+     model: ./Ninja-v1-RP-expressive-v2-LoRA
+     parameters:
+       density:
+       - filter: self_attn
+         value: 0.9313941807354906
+       - filter: mlp
+         value: 1.0
+       - value: 1.0
+       weight:
+       - filter: self_attn
+         value: 0.1443680121177074
+       - filter: mlp
+         value: 0.08309606396368145
+       - value: 0.37059044424517035
+   - layer_range: [30, 32]
+     model: ./Yosegi-0601
+
+ ```
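
In principle the merge can be re-run by pointing mergekit's `mergekit-yaml` entry point at this configuration, assuming local copies of `./Yosegi-0601` and `./Ninja-v1-RP-expressive-v2-LoRA` are available alongside the Hub models. The result is a standard Mistral-architecture checkpoint, so it loads like any other causal LM with transformers. A minimal loading sketch, assuming the merged weights sit in a local `./Yosegi-2` directory (the path is illustrative, not a confirmed repo id):

```python
# Minimal sketch: load the merged model with transformers.
# "./Yosegi-2" is an assumed local path to the merged weights, not a confirmed repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./Yosegi-2"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # same dtype the merge was written in
    device_map="auto",
)

prompt = "Once upon a time,"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```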
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "./Yosegi-0601",
+   "architectures": [
+     "MistralForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "hidden_act": "silu",
+   "hidden_size": 4096,
+   "initializer_range": 0.02,
+   "intermediate_size": 14336,
+   "max_position_embeddings": 32768,
+   "model_type": "mistral",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 32,
+   "num_key_value_heads": 8,
+   "rms_norm_eps": 1e-05,
+   "rope_theta": 10000.0,
+   "sliding_window": 4096,
+   "tie_word_embeddings": false,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.41.1",
+   "use_cache": true,
+   "vocab_size": 32000
+ }
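
The config describes a stock Mistral-7B layout: 32 hidden layers, 32 attention heads with 8 key/value heads (grouped-query attention), a 14336-wide MLP, bfloat16 weights, and a 32768-token position range with a 4096-token sliding window. A quick way to sanity-check those fields without loading any weights is `AutoConfig`; the path below is illustrative:

```python
# Sketch: inspect the merged model's config without loading the weights.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("./Yosegi-2")  # assumed local path to this repo's files
print(cfg.model_type)               # "mistral"
print(cfg.num_hidden_layers)        # 32
print(cfg.num_key_value_heads)      # 8 -> grouped-query attention
print(cfg.max_position_embeddings)  # 32768
print(cfg.torch_dtype)              # bfloat16
```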
mergekit_config.yml ADDED
@@ -0,0 +1,1015 @@
1
+ base_model: ./Yosegi-0601
2
+ dtype: bfloat16
3
+ merge_method: ties
4
+ parameters:
5
+ int8_mask: 1.0
6
+ normalize: 0.0
7
+ slices:
8
+ - sources:
9
+ - layer_range: [0, 2]
10
+ model: mistralai/Mistral-7B-v0.3
11
+ parameters:
12
+ density:
13
+ - filter: self_attn
14
+ value: 1.0
15
+ - filter: mlp
16
+ value: 0.9895701336232673
17
+ - value: 1.0
18
+ weight:
19
+ - filter: self_attn
20
+ value: 0.5057237984975562
21
+ - filter: mlp
22
+ value: 0.36247235528151495
23
+ - value: 0.0076810835717692014
24
+ - layer_range: [0, 2]
25
+ model: meta-math/MetaMath-Mistral-7B
26
+ parameters:
27
+ density:
28
+ - filter: self_attn
29
+ value: 1.0
30
+ - filter: mlp
31
+ value: 1.0
32
+ - value: 0.8239779346577963
33
+ weight:
34
+ - filter: self_attn
35
+ value: 0.27499287617186813
36
+ - filter: mlp
37
+ value: 0.10579959634086915
38
+ - value: 0.14502290477239704
39
+ - layer_range: [0, 2]
40
+ model: uukuguy/speechless-zephyr-code-functionary-7b
41
+ parameters:
42
+ density:
43
+ - filter: self_attn
44
+ value: 1.0
45
+ - filter: mlp
46
+ value: 0.9654867628269999
47
+ - value: 0.9584724004158125
48
+ weight:
49
+ - filter: self_attn
50
+ value: 0.059719404899177556
51
+ - filter: mlp
52
+ value: 0.1299695859327612
53
+ - value: 0.18821871354400985
54
+ - layer_range: [0, 2]
55
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
56
+ parameters:
57
+ density:
58
+ - filter: self_attn
59
+ value: 0.9322987005873715
60
+ - filter: mlp
61
+ value: 0.8119693860979944
62
+ - value: 0.7800996941956229
63
+ weight:
64
+ - filter: self_attn
65
+ value: 0.14989333734000856
66
+ - filter: mlp
67
+ value: 0.20525182711733667
68
+ - value: 0.0743540962371737
69
+ - layer_range: [0, 2]
70
+ model: ./Yosegi-0601
71
+ - sources:
72
+ - layer_range: [2, 4]
73
+ model: mistralai/Mistral-7B-v0.3
74
+ parameters:
75
+ density:
76
+ - filter: self_attn
77
+ value: 0.6361163471256639
78
+ - filter: mlp
79
+ value: 0.9983948965135213
80
+ - value: 1.0
81
+ weight:
82
+ - filter: self_attn
83
+ value: 0.2433049522842103
84
+ - filter: mlp
85
+ value: 0.11537153133586801
86
+ - value: 0.11236945502439658
87
+ - layer_range: [2, 4]
88
+ model: meta-math/MetaMath-Mistral-7B
89
+ parameters:
90
+ density:
91
+ - filter: self_attn
92
+ value: 1.0
93
+ - filter: mlp
94
+ value: 1.0
95
+ - value: 1.0
96
+ weight:
97
+ - filter: self_attn
98
+ value: 0.13087986863180992
99
+ - filter: mlp
100
+ value: 0.05060452788200992
101
+ - value: 0.029882383396623725
102
+ - layer_range: [2, 4]
103
+ model: uukuguy/speechless-zephyr-code-functionary-7b
104
+ parameters:
105
+ density:
106
+ - filter: self_attn
107
+ value: 0.9938109261305853
108
+ - filter: mlp
109
+ value: 0.709432587913349
110
+ - value: 1.0
111
+ weight:
112
+ - filter: self_attn
113
+ value: 0.15343343058938377
114
+ - filter: mlp
115
+ value: 0.4105917936868785
116
+ - value: 0.6078632204623161
117
+ - layer_range: [2, 4]
118
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
119
+ parameters:
120
+ density:
121
+ - filter: self_attn
122
+ value: 1.0
123
+ - filter: mlp
124
+ value: 1.0
125
+ - value: 0.9634269234020544
126
+ weight:
127
+ - filter: self_attn
128
+ value: 0.03750763360681478
129
+ - filter: mlp
130
+ value: 0.29089122858987404
131
+ - value: 0.3408085857388722
132
+ - layer_range: [2, 4]
133
+ model: ./Yosegi-0601
134
+ - sources:
135
+ - layer_range: [4, 6]
136
+ model: mistralai/Mistral-7B-v0.3
137
+ parameters:
138
+ density:
139
+ - filter: self_attn
140
+ value: 0.8057109303418598
141
+ - filter: mlp
142
+ value: 0.9954520808628292
143
+ - value: 1.0
144
+ weight:
145
+ - filter: self_attn
146
+ value: 0.02598285706585618
147
+ - filter: mlp
148
+ value: 0.06661629726622949
149
+ - value: 0.1285191000066376
150
+ - layer_range: [4, 6]
151
+ model: meta-math/MetaMath-Mistral-7B
152
+ parameters:
153
+ density:
154
+ - filter: self_attn
155
+ value: 0.9112825916608848
156
+ - filter: mlp
157
+ value: 0.9322557507910056
158
+ - value: 1.0
159
+ weight:
160
+ - filter: self_attn
161
+ value: 0.18823564379986454
162
+ - filter: mlp
163
+ value: 0.4552822441636322
164
+ - value: 0.5120525709221785
165
+ - layer_range: [4, 6]
166
+ model: uukuguy/speechless-zephyr-code-functionary-7b
167
+ parameters:
168
+ density:
169
+ - filter: self_attn
170
+ value: 0.9869122169774399
171
+ - filter: mlp
172
+ value: 1.0
173
+ - value: 0.9751291459565757
174
+ weight:
175
+ - filter: self_attn
176
+ value: 0.00493134813843582
177
+ - filter: mlp
178
+ value: 0.3008979965262413
179
+ - value: 0.2528466849993097
180
+ - layer_range: [4, 6]
181
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
182
+ parameters:
183
+ density:
184
+ - filter: self_attn
185
+ value: 1.0
186
+ - filter: mlp
187
+ value: 0.8956512783019246
188
+ - value: 1.0
189
+ weight:
190
+ - filter: self_attn
191
+ value: 0.4197408619693966
192
+ - filter: mlp
193
+ value: 0.1448902874618845
194
+ - value: 0.5196932662212128
195
+ - layer_range: [4, 6]
196
+ model: ./Yosegi-0601
197
+ - sources:
198
+ - layer_range: [6, 8]
199
+ model: mistralai/Mistral-7B-v0.3
200
+ parameters:
201
+ density:
202
+ - filter: self_attn
203
+ value: 1.0
204
+ - filter: mlp
205
+ value: 1.0
206
+ - value: 1.0
207
+ weight:
208
+ - filter: self_attn
209
+ value: 0.05321377226808306
210
+ - filter: mlp
211
+ value: 0.0482589904702303
212
+ - value: 0.433407006546336
213
+ - layer_range: [6, 8]
214
+ model: meta-math/MetaMath-Mistral-7B
215
+ parameters:
216
+ density:
217
+ - filter: self_attn
218
+ value: 0.8300482882633113
219
+ - filter: mlp
220
+ value: 0.8951636861593875
221
+ - value: 1.0
222
+ weight:
223
+ - filter: self_attn
224
+ value: 0.35952608658046414
225
+ - filter: mlp
226
+ value: 0.17385333183950857
227
+ - value: 0.6366514725970246
228
+ - layer_range: [6, 8]
229
+ model: uukuguy/speechless-zephyr-code-functionary-7b
230
+ parameters:
231
+ density:
232
+ - filter: self_attn
233
+ value: 0.7848308077099464
234
+ - filter: mlp
235
+ value: 0.869549457974157
236
+ - value: 1.0
237
+ weight:
238
+ - filter: self_attn
239
+ value: 0.12433943050311849
240
+ - filter: mlp
241
+ value: 0.3065832590226165
242
+ - value: 0.33138948726149514
243
+ - layer_range: [6, 8]
244
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
245
+ parameters:
246
+ density:
247
+ - filter: self_attn
248
+ value: 1.0
249
+ - filter: mlp
250
+ value: 1.0
251
+ - value: 1.0
252
+ weight:
253
+ - filter: self_attn
254
+ value: 0.11885967308786714
255
+ - filter: mlp
256
+ value: 0.29125668567121127
257
+ - value: 0.19251901269486088
258
+ - layer_range: [6, 8]
259
+ model: ./Yosegi-0601
260
+ - sources:
261
+ - layer_range: [8, 10]
262
+ model: mistralai/Mistral-7B-v0.3
263
+ parameters:
264
+ density:
265
+ - filter: self_attn
266
+ value: 1.0
267
+ - filter: mlp
268
+ value: 0.9429625513013793
269
+ - value: 1.0
270
+ weight:
271
+ - filter: self_attn
272
+ value: 0.4085396076816443
273
+ - filter: mlp
274
+ value: 0.038473657720644636
275
+ - value: 0.35014489493395495
276
+ - layer_range: [8, 10]
277
+ model: meta-math/MetaMath-Mistral-7B
278
+ parameters:
279
+ density:
280
+ - filter: self_attn
281
+ value: 1.0
282
+ - filter: mlp
283
+ value: 1.0
284
+ - value: 1.0
285
+ weight:
286
+ - filter: self_attn
287
+ value: 0.26957216810533163
288
+ - filter: mlp
289
+ value: 0.2393300696241166
290
+ - value: 0.4735322427351712
291
+ - layer_range: [8, 10]
292
+ model: uukuguy/speechless-zephyr-code-functionary-7b
293
+ parameters:
294
+ density:
295
+ - filter: self_attn
296
+ value: 0.8594757954447017
297
+ - filter: mlp
298
+ value: 1.0
299
+ - value: 1.0
300
+ weight:
301
+ - filter: self_attn
302
+ value: 0.26101395702355007
303
+ - filter: mlp
304
+ value: 0.3147672140145126
305
+ - value: 0.11658182776184756
306
+ - layer_range: [8, 10]
307
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
308
+ parameters:
309
+ density:
310
+ - filter: self_attn
311
+ value: 1.0
312
+ - filter: mlp
313
+ value: 0.6948062341711919
314
+ - value: 0.9312401427737346
315
+ weight:
316
+ - filter: self_attn
317
+ value: 0.1987774487170517
318
+ - filter: mlp
319
+ value: 0.5628384475763534
320
+ - value: 0.2765378221890683
321
+ - layer_range: [8, 10]
322
+ model: ./Yosegi-0601
323
+ - sources:
324
+ - layer_range: [10, 12]
325
+ model: mistralai/Mistral-7B-v0.3
326
+ parameters:
327
+ density:
328
+ - filter: self_attn
329
+ value: 1.0
330
+ - filter: mlp
331
+ value: 0.8230035654228713
332
+ - value: 1.0
333
+ weight:
334
+ - filter: self_attn
335
+ value: 0.1741591536775035
336
+ - filter: mlp
337
+ value: 0.30563583223301516
338
+ - value: 0.2060419023239155
339
+ - layer_range: [10, 12]
340
+ model: meta-math/MetaMath-Mistral-7B
341
+ parameters:
342
+ density:
343
+ - filter: self_attn
344
+ value: 1.0
345
+ - filter: mlp
346
+ value: 0.9991063013557119
347
+ - value: 1.0
348
+ weight:
349
+ - filter: self_attn
350
+ value: 0.1470996125766866
351
+ - filter: mlp
352
+ value: 0.06646481892400827
353
+ - value: 0.2645489609472036
354
+ - layer_range: [10, 12]
355
+ model: uukuguy/speechless-zephyr-code-functionary-7b
356
+ parameters:
357
+ density:
358
+ - filter: self_attn
359
+ value: 0.6812899560643833
360
+ - filter: mlp
361
+ value: 0.9083104648631823
362
+ - value: 0.9730062683598184
363
+ weight:
364
+ - filter: self_attn
365
+ value: 0.14278507832578724
366
+ - filter: mlp
367
+ value: 0.3475945971407978
368
+ - value: 0.40266546962595284
369
+ - layer_range: [10, 12]
370
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
371
+ parameters:
372
+ density:
373
+ - filter: self_attn
374
+ value: 0.7047231879232164
375
+ - filter: mlp
376
+ value: 0.9148432633716144
377
+ - value: 1.0
378
+ weight:
379
+ - filter: self_attn
380
+ value: 0.15341559366405985
381
+ - filter: mlp
382
+ value: 0.20047704006010095
383
+ - value: 0.17364445581398172
384
+ - layer_range: [10, 12]
385
+ model: ./Yosegi-0601
386
+ - sources:
387
+ - layer_range: [12, 14]
388
+ model: mistralai/Mistral-7B-v0.3
389
+ parameters:
390
+ density:
391
+ - filter: self_attn
392
+ value: 0.6974090973508299
393
+ - filter: mlp
394
+ value: 1.0
395
+ - value: 0.9553573565285324
396
+ weight:
397
+ - filter: self_attn
398
+ value: 0.03614401712451334
399
+ - filter: mlp
400
+ value: 0.1287785039219736
401
+ - value: 0.3780545754310749
402
+ - layer_range: [12, 14]
403
+ model: meta-math/MetaMath-Mistral-7B
404
+ parameters:
405
+ density:
406
+ - filter: self_attn
407
+ value: 0.7857328784783159
408
+ - filter: mlp
409
+ value: 1.0
410
+ - value: 0.6631303877423032
411
+ weight:
412
+ - filter: self_attn
413
+ value: 0.21728574423632604
414
+ - filter: mlp
415
+ value: 0.22813107248290188
416
+ - value: 0.1435266378249425
417
+ - layer_range: [12, 14]
418
+ model: uukuguy/speechless-zephyr-code-functionary-7b
419
+ parameters:
420
+ density:
421
+ - filter: self_attn
422
+ value: 0.7579910864422339
423
+ - filter: mlp
424
+ value: 1.0
425
+ - value: 1.0
426
+ weight:
427
+ - filter: self_attn
428
+ value: 0.21526786827735228
429
+ - filter: mlp
430
+ value: 0.19769619474642783
431
+ - value: 0.49420458585638627
432
+ - layer_range: [12, 14]
433
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
434
+ parameters:
435
+ density:
436
+ - filter: self_attn
437
+ value: 0.8379590665264793
438
+ - filter: mlp
439
+ value: 1.0
440
+ - value: 0.6778673543559375
441
+ weight:
442
+ - filter: self_attn
443
+ value: 0.060679858649663874
444
+ - filter: mlp
445
+ value: 0.17248738428562518
446
+ - value: 0.05145640258269078
447
+ - layer_range: [12, 14]
448
+ model: ./Yosegi-0601
449
+ - sources:
450
+ - layer_range: [14, 16]
451
+ model: mistralai/Mistral-7B-v0.3
452
+ parameters:
453
+ density:
454
+ - filter: self_attn
455
+ value: 1.0
456
+ - filter: mlp
457
+ value: 0.8193296716327286
458
+ - value: 0.709644132681917
459
+ weight:
460
+ - filter: self_attn
461
+ value: 0.09821428505487592
462
+ - filter: mlp
463
+ value: 0.0039875777021436964
464
+ - value: 0.27550746634944184
465
+ - layer_range: [14, 16]
466
+ model: meta-math/MetaMath-Mistral-7B
467
+ parameters:
468
+ density:
469
+ - filter: self_attn
470
+ value: 0.9420135087156387
471
+ - filter: mlp
472
+ value: 1.0
473
+ - value: 0.9478569230341948
474
+ weight:
475
+ - filter: self_attn
476
+ value: 0.32640822225239857
477
+ - filter: mlp
478
+ value: 0.28189746971019747
479
+ - value: 0.09777040841174603
480
+ - layer_range: [14, 16]
481
+ model: uukuguy/speechless-zephyr-code-functionary-7b
482
+ parameters:
483
+ density:
484
+ - filter: self_attn
485
+ value: 0.9811539353914964
486
+ - filter: mlp
487
+ value: 1.0
488
+ - value: 0.9947034500579488
489
+ weight:
490
+ - filter: self_attn
491
+ value: 0.015308461456516246
492
+ - filter: mlp
493
+ value: 0.0018966958379955934
494
+ - value: 0.24275389952300747
495
+ - layer_range: [14, 16]
496
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
497
+ parameters:
498
+ density:
499
+ - filter: self_attn
500
+ value: 0.9022355771447704
501
+ - filter: mlp
502
+ value: 1.0
503
+ - value: 1.0
504
+ weight:
505
+ - filter: self_attn
506
+ value: 0.03331841447575224
507
+ - filter: mlp
508
+ value: 0.03561712850019841
509
+ - value: 0.16096143804589919
510
+ - layer_range: [14, 16]
511
+ model: ./Yosegi-0601
512
+ - sources:
513
+ - layer_range: [16, 18]
514
+ model: mistralai/Mistral-7B-v0.3
515
+ parameters:
516
+ density:
517
+ - filter: self_attn
518
+ value: 0.8813466618200871
519
+ - filter: mlp
520
+ value: 1.0
521
+ - value: 1.0
522
+ weight:
523
+ - filter: self_attn
524
+ value: 0.20435001101909528
525
+ - filter: mlp
526
+ value: 0.1516594727144469
527
+ - value: 0.2269819409999868
528
+ - layer_range: [16, 18]
529
+ model: meta-math/MetaMath-Mistral-7B
530
+ parameters:
531
+ density:
532
+ - filter: self_attn
533
+ value: 1.0
534
+ - filter: mlp
535
+ value: 1.0
536
+ - value: 0.8113796412034742
537
+ weight:
538
+ - filter: self_attn
539
+ value: 0.23760349395229585
540
+ - filter: mlp
541
+ value: 0.1725436279774783
542
+ - value: 0.5818814139457673
543
+ - layer_range: [16, 18]
544
+ model: uukuguy/speechless-zephyr-code-functionary-7b
545
+ parameters:
546
+ density:
547
+ - filter: self_attn
548
+ value: 1.0
549
+ - filter: mlp
550
+ value: 0.9307369835995082
551
+ - value: 1.0
552
+ weight:
553
+ - filter: self_attn
554
+ value: 0.0673898519051937
555
+ - filter: mlp
556
+ value: 0.049368399457210624
557
+ - value: 0.2621269048339309
558
+ - layer_range: [16, 18]
559
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
560
+ parameters:
561
+ density:
562
+ - filter: self_attn
563
+ value: 0.8219541044757637
564
+ - filter: mlp
565
+ value: 1.0
566
+ - value: 1.0
567
+ weight:
568
+ - filter: self_attn
569
+ value: 0.21320061393511042
570
+ - filter: mlp
571
+ value: 0.09188781867337345
572
+ - value: 0.27266490524762327
573
+ - layer_range: [16, 18]
574
+ model: ./Yosegi-0601
575
+ - sources:
576
+ - layer_range: [18, 20]
577
+ model: mistralai/Mistral-7B-v0.3
578
+ parameters:
579
+ density:
580
+ - filter: self_attn
581
+ value: 1.0
582
+ - filter: mlp
583
+ value: 0.7993530327131696
584
+ - value: 1.0
585
+ weight:
586
+ - filter: self_attn
587
+ value: 0.20420262433348008
588
+ - filter: mlp
589
+ value: 0.43400570066910155
590
+ - value: 0.13720822682656159
591
+ - layer_range: [18, 20]
592
+ model: meta-math/MetaMath-Mistral-7B
593
+ parameters:
594
+ density:
595
+ - filter: self_attn
596
+ value: 1.0
597
+ - filter: mlp
598
+ value: 1.0
599
+ - value: 0.7035563346885239
600
+ weight:
601
+ - filter: self_attn
602
+ value: 0.3313523263002212
603
+ - filter: mlp
604
+ value: 0.356035051194268
605
+ - value: 0.4742357680522683
606
+ - layer_range: [18, 20]
607
+ model: uukuguy/speechless-zephyr-code-functionary-7b
608
+ parameters:
609
+ density:
610
+ - filter: self_attn
611
+ value: 1.0
612
+ - filter: mlp
613
+ value: 1.0
614
+ - value: 1.0
615
+ weight:
616
+ - filter: self_attn
617
+ value: 0.2475654838180605
618
+ - filter: mlp
619
+ value: 0.35095371882044646
620
+ - value: 0.18536862919946695
621
+ - layer_range: [18, 20]
622
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
623
+ parameters:
624
+ density:
625
+ - filter: self_attn
626
+ value: 1.0
627
+ - filter: mlp
628
+ value: 1.0
629
+ - value: 1.0
630
+ weight:
631
+ - filter: self_attn
632
+ value: 0.02997204931537696
633
+ - filter: mlp
634
+ value: 0.4103581291392323
635
+ - value: 0.19313933251158066
636
+ - layer_range: [18, 20]
637
+ model: ./Yosegi-0601
638
+ - sources:
639
+ - layer_range: [20, 22]
640
+ model: mistralai/Mistral-7B-v0.3
641
+ parameters:
642
+ density:
643
+ - filter: self_attn
644
+ value: 1.0
645
+ - filter: mlp
646
+ value: 1.0
647
+ - value: 0.5321196166337413
648
+ weight:
649
+ - filter: self_attn
650
+ value: 0.17930537920958298
651
+ - filter: mlp
652
+ value: 0.07662274511683252
653
+ - value: 0.1354315278471591
654
+ - layer_range: [20, 22]
655
+ model: meta-math/MetaMath-Mistral-7B
656
+ parameters:
657
+ density:
658
+ - filter: self_attn
659
+ value: 1.0
660
+ - filter: mlp
661
+ value: 0.3768803907042144
662
+ - value: 1.0
663
+ weight:
664
+ - filter: self_attn
665
+ value: 0.1592147705254305
666
+ - filter: mlp
667
+ value: 0.18410207999201075
668
+ - value: 0.4928015910047033
669
+ - layer_range: [20, 22]
670
+ model: uukuguy/speechless-zephyr-code-functionary-7b
671
+ parameters:
672
+ density:
673
+ - filter: self_attn
674
+ value: 1.0
675
+ - filter: mlp
676
+ value: 1.0
677
+ - value: 1.0
678
+ weight:
679
+ - filter: self_attn
680
+ value: 0.37897278298418885
681
+ - filter: mlp
682
+ value: 0.0952591073533606
683
+ - value: 0.03551732810121447
684
+ - layer_range: [20, 22]
685
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
686
+ parameters:
687
+ density:
688
+ - filter: self_attn
689
+ value: 1.0
690
+ - filter: mlp
691
+ value: 1.0
692
+ - value: 1.0
693
+ weight:
694
+ - filter: self_attn
695
+ value: 0.2682334102128691
696
+ - filter: mlp
697
+ value: 0.33485781481395227
698
+ - value: 0.3395139468281392
699
+ - layer_range: [20, 22]
700
+ model: ./Yosegi-0601
701
+ - sources:
702
+ - layer_range: [22, 24]
703
+ model: mistralai/Mistral-7B-v0.3
704
+ parameters:
705
+ density:
706
+ - filter: self_attn
707
+ value: 1.0
708
+ - filter: mlp
709
+ value: 0.8002588203446623
710
+ - value: 1.0
711
+ weight:
712
+ - filter: self_attn
713
+ value: 0.2549204541625693
714
+ - filter: mlp
715
+ value: 0.3722418477156178
716
+ - value: 0.2410463731352089
717
+ - layer_range: [22, 24]
718
+ model: meta-math/MetaMath-Mistral-7B
719
+ parameters:
720
+ density:
721
+ - filter: self_attn
722
+ value: 0.9220873255898425
723
+ - filter: mlp
724
+ value: 1.0
725
+ - value: 1.0
726
+ weight:
727
+ - filter: self_attn
728
+ value: 0.487455295718532
729
+ - filter: mlp
730
+ value: 0.40022413917173594
731
+ - value: 0.17846009757502157
732
+ - layer_range: [22, 24]
733
+ model: uukuguy/speechless-zephyr-code-functionary-7b
734
+ parameters:
735
+ density:
736
+ - filter: self_attn
737
+ value: 0.7696341317318985
738
+ - filter: mlp
739
+ value: 1.0
740
+ - value: 1.0
741
+ weight:
742
+ - filter: self_attn
743
+ value: 0.011267799816515114
744
+ - filter: mlp
745
+ value: 0.5320959832591042
746
+ - value: 0.17095406531325266
747
+ - layer_range: [22, 24]
748
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
749
+ parameters:
750
+ density:
751
+ - filter: self_attn
752
+ value: 1.0
753
+ - filter: mlp
754
+ value: 1.0
755
+ - value: 1.0
756
+ weight:
757
+ - filter: self_attn
758
+ value: 0.556101646343872
759
+ - filter: mlp
760
+ value: 0.5470253909079791
761
+ - value: 0.13241555469863223
762
+ - layer_range: [22, 24]
763
+ model: ./Yosegi-0601
764
+ - sources:
765
+ - layer_range: [24, 26]
766
+ model: mistralai/Mistral-7B-v0.3
767
+ parameters:
768
+ density:
769
+ - filter: self_attn
770
+ value: 0.8667033674916582
771
+ - filter: mlp
772
+ value: 1.0
773
+ - value: 0.9446091486920749
774
+ weight:
775
+ - filter: self_attn
776
+ value: 0.4134110775513897
777
+ - filter: mlp
778
+ value: 0.0181822765943834
779
+ - value: 0.22797659617038232
780
+ - layer_range: [24, 26]
781
+ model: meta-math/MetaMath-Mistral-7B
782
+ parameters:
783
+ density:
784
+ - filter: self_attn
785
+ value: 1.0
786
+ - filter: mlp
787
+ value: 0.9839865829690491
788
+ - value: 0.8252981103449059
789
+ weight:
790
+ - filter: self_attn
791
+ value: 0.3310295320944009
792
+ - filter: mlp
793
+ value: 0.05341478458353629
794
+ - value: 0.3588847186159219
795
+ - layer_range: [24, 26]
796
+ model: uukuguy/speechless-zephyr-code-functionary-7b
797
+ parameters:
798
+ density:
799
+ - filter: self_attn
800
+ value: 1.0
801
+ - filter: mlp
802
+ value: 0.8834823812212265
803
+ - value: 0.8195593509048733
804
+ weight:
805
+ - filter: self_attn
806
+ value: 0.3778012590489552
807
+ - filter: mlp
808
+ value: 0.2553204906819882
809
+ - value: 0.23250565137970108
810
+ - layer_range: [24, 26]
811
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
812
+ parameters:
813
+ density:
814
+ - filter: self_attn
815
+ value: 1.0
816
+ - filter: mlp
817
+ value: 0.7731602497153744
818
+ - value: 0.8647152680789973
819
+ weight:
820
+ - filter: self_attn
821
+ value: 0.1101209698118704
822
+ - filter: mlp
823
+ value: 0.2399169741437055
+ - value: 0.32311925187355206
+ - layer_range: [24, 26]
+ model: ./Yosegi-0601
+ - sources:
+ - layer_range: [26, 28]
+ model: mistralai/Mistral-7B-v0.3
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 0.9508674341172941
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.4312865186270921
+ - filter: mlp
+ value: 0.28336325917543326
+ - value: 0.051826325177477234
+ - layer_range: [26, 28]
+ model: meta-math/MetaMath-Mistral-7B
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 0.8945725432745376
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.03524133636598346
+ - filter: mlp
+ value: 0.21426126710725438
+ - value: 0.31724116335002545
+ - layer_range: [26, 28]
+ model: uukuguy/speechless-zephyr-code-functionary-7b
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 0.7138130384877139
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.04890129864608137
+ - filter: mlp
+ value: 0.3324333287494201
+ - value: 0.11533647335498036
+ - layer_range: [26, 28]
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 1.0
+ - value: 0.9200281327001997
+ weight:
+ - filter: self_attn
+ value: 0.300842776105564
+ - filter: mlp
+ value: 0.08363140003203932
+ - value: 0.2538677006866867
+ - layer_range: [26, 28]
+ model: ./Yosegi-0601
+ - sources:
+ - layer_range: [28, 30]
+ model: mistralai/Mistral-7B-v0.3
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 0.7116000185808022
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.10977758983122704
+ - filter: mlp
+ value: 0.1839207861311269
+ - value: 0.5426174846632369
+ - layer_range: [28, 30]
+ model: meta-math/MetaMath-Mistral-7B
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 0.8412049419861911
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.3517232690814979
+ - filter: mlp
+ value: 0.11878679655495025
+ - value: 0.432611353923264
+ - layer_range: [28, 30]
+ model: uukuguy/speechless-zephyr-code-functionary-7b
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 0.7196182744068202
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.29848623969081
+ - filter: mlp
+ value: 0.034661358236493495
+ - value: 0.3438376072572394
+ - layer_range: [28, 30]
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 1.0
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.051511204430449285
+ - filter: mlp
+ value: 0.3617968383178797
+ - value: 0.2578690795635758
+ - layer_range: [28, 30]
+ model: ./Yosegi-0601
+ - sources:
+ - layer_range: [30, 32]
+ model: mistralai/Mistral-7B-v0.3
+ parameters:
+ density:
+ - filter: self_attn
+ value: 0.7971002248466003
+ - filter: mlp
+ value: 0.8931695149333363
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.07401430804790136
+ - filter: mlp
+ value: 0.00696466997386886
+ - value: 0.08295038526296711
+ - layer_range: [30, 32]
+ model: meta-math/MetaMath-Mistral-7B
+ parameters:
+ density:
+ - filter: self_attn
+ value: 1.0
+ - filter: mlp
+ value: 0.8158777337631619
+ - value: 0.8348784699583887
+ weight:
+ - filter: self_attn
+ value: 0.26799125918248423
+ - filter: mlp
+ value: 0.08176923813129498
+ - value: 0.030317330226146508
+ - layer_range: [30, 32]
+ model: uukuguy/speechless-zephyr-code-functionary-7b
+ parameters:
+ density:
+ - filter: self_attn
+ value: 0.8188850632365792
+ - filter: mlp
+ value: 0.7463831519693573
+ - value: 0.6515317051533988
+ weight:
+ - filter: self_attn
+ value: 0.21122007850953434
+ - filter: mlp
+ value: 0.1463362342258229
+ - value: 0.09176704194956312
+ - layer_range: [30, 32]
+ model: ./Ninja-v1-RP-expressive-v2-LoRA
+ parameters:
+ density:
+ - filter: self_attn
+ value: 0.9313941807354906
+ - filter: mlp
+ value: 1.0
+ - value: 1.0
+ weight:
+ - filter: self_attn
+ value: 0.1443680121177074
+ - filter: mlp
+ value: 0.08309606396368145
+ - value: 0.37059044424517035
+ - layer_range: [30, 32]
+ model: ./Yosegi-0601
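
As a rough reproduction sketch (not part of the original upload): a config like the one above can be materialized into a merged checkpoint with the mergekit command-line tool. The config and output paths below are illustrative placeholders.

```python
# Sketch: run a saved mergekit YAML config through the mergekit-yaml CLI.
# Assumes mergekit is installed (pip install mergekit) and that the YAML shown
# above has been saved to disk; both paths here are placeholders.
import subprocess

CONFIG_PATH = "yosegi-2-config.yaml"   # the merge configuration above, saved locally
OUTPUT_DIR = "./Yosegi-2-merged"       # directory where merged weights will be written

# mergekit-yaml <config> <output-dir> is the standard entry point for running a merge.
subprocess.run(["mergekit-yaml", CONFIG_PATH, OUTPUT_DIR], check=True)
```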
model-00001-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0c76f3665a788a49dac074ef36082a7730d76d45eb491a2d44560c4b7af61c6f
+ size 4886547008
model-00002-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ff84a0a9e62b5cd192df4f11dde4ab5dd63e4ca39ab1398c7626ed9234722665
+ size 4915916176
model-00003-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:73358c2ea83bf22d73d195d8385436364195060dd0dcde99c92742832ddc57cd
+ size 4681034848
model.safetensors.index.json ADDED
@@ -0,0 +1 @@
+ {"metadata": {"mergekit_version": "0.0.4.2", "total_size": 14483464192}, "weight_map": {"lm_head.weight": "model-00001-of-00003.safetensors", "model.embed_tokens.weight": "model-00001-of-00003.safetensors", "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.10.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.11.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.12.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.12.self_attn.o_proj.weight": 
"model-00001-of-00003.safetensors", "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.13.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.14.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.15.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.16.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00001-of-00003.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.17.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00001-of-00003.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00001-of-00003.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00001-of-00003.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00001-of-00003.safetensors", "model.layers.17.self_attn.o_proj.weight": 
"model-00001-of-00003.safetensors", "model.layers.17.self_attn.q_proj.weight": "model-00001-of-00003.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00001-of-00003.safetensors", "model.layers.18.input_layernorm.weight": "model-00001-of-00003.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.2.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.21.self_attn.o_proj.weight": 
"model-00002-of-00003.safetensors", "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.23.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.24.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.25.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.26.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.26.self_attn.o_proj.weight": 
"model-00002-of-00003.safetensors", "model.layers.26.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.27.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00002-of-00003.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00002-of-00003.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00002-of-00003.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00002-of-00003.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00002-of-00003.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00002-of-00003.safetensors", "model.layers.28.input_layernorm.weight": "model-00002-of-00003.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00002-of-00003.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.29.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.3.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.30.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.30.self_attn.o_proj.weight": 
"model-00003-of-00003.safetensors", "model.layers.30.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.31.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.4.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.5.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.6.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.7.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", 
"model.layers.7.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.8.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.layers.9.input_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00003-of-00003.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00003-of-00003.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00003-of-00003.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00003-of-00003.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00003-of-00003.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00003-of-00003.safetensors", "model.layers.9.self_attn.q_proj.weight": "model-00003-of-00003.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00003-of-00003.safetensors", "model.norm.weight": "model-00003-of-00003.safetensors"}}
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,44 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "additional_special_tokens": [],
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "</s>",
+ "legacy": true,
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": null,
+ "padding_side": "left",
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "split_special_tokens": false,
+ "tokenizer_class": "LlamaTokenizer",
+ "unk_token": "<unk>",
+ "use_default_system_prompt": true
+ }
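
A usage sketch, assuming the shards, index, and tokenizer files uploaded in this commit are available in a local directory (the path below is a placeholder for that directory or the corresponding Hugging Face repo id):

```python
# Sketch: load the merged model and tokenizer with transformers and generate text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./Yosegi-2"  # placeholder: local dir or repo id containing the files above
tokenizer = AutoTokenizer.from_pretrained(model_path)
# bfloat16 matches the dtype used in the merge configuration.
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16)

prompt = "Write a short introduction to merged language models."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```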