Commit b207d30 (parent edb9bb1) by SicariusSicariiStuff: Upload 15 files
README.md CHANGED
---
language:
- en
license: unknown
library_name: transformers
tags:
- mergekit
- merge
- not-for-all-audiences
model-index:
- name: Nimbus-Miqu-v0.1-70B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 46.47
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=invisietch/Nimbus-Miqu-v0.1-70B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 43.45
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=invisietch/Nimbus-Miqu-v0.1-70B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 5.44
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=invisietch/Nimbus-Miqu-v0.1-70B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 11.86
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=invisietch/Nimbus-Miqu-v0.1-70B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 9.33
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=invisietch/Nimbus-Miqu-v0.1-70B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 31.7
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=invisietch/Nimbus-Miqu-v0.1-70B
      name: Open LLM Leaderboard
---

<div align="center">
<b style="font-size: 36px;">Nimbus-Miqu-v0.1-70B</b>

<img src="https://huggingface.co/invisietch/Nimbus-Miqu-v0.1-70B/resolve/main/nimbus.png" style="width:70%">
</div>

# Model Details

Nimbus Miqu is an attempt to keep Midnight Miqu's creative writing abilities while merging out some of its positivity bias, by bringing in two models known to be capable of negativity: Euryale v1.3 & Chronos 70B. Effectively, it aims for a midpoint between Midnight Miqu and Dusk Miqu.

This is primarily an RP/creative writing model and has not been tested for other use cases.

Since this is a Miqu merge, it should be comfortable with context lengths up to 32k.

# Feedback

I appreciate all feedback on any of my model merges. You can use:

* [The Community tab](https://huggingface.co/invisietch/Nimbus-Miqu-v0.1-70B/discussions) - requires HF login.
* [The SillyTavern Discord thread](https://discord.com/channels/1100685673633153084/1258146078679633990) - must be on [SillyTavern Discord](https://discord.gg/sillytavern).
* Discord DMs to **invisietch**.

Your feedback is how I improve these models for future versions.

# Quantization Formats

* [FP16 Safetensors](https://huggingface.co/invisietch/Nimbus-Miqu-v0.1-70B)
* [iMatrix GGUFs](https://huggingface.co/mradermacher/Nimbus-Miqu-v0.1-70B-i1-GGUF) - thanks to [mradermacher](https://huggingface.co/mradermacher).
* [Static GGUFs](https://huggingface.co/mradermacher/Nimbus-Miqu-v0.1-70B-GGUF/) - thanks to [mradermacher](https://huggingface.co/mradermacher).
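
As a minimal sketch (not an official recipe; it assumes you have roughly 140 GB of GPU/CPU memory for the full float16 weights, per the shard sizes in this repo), the FP16 safetensors release can be loaded with the `transformers` library:

```python
# Minimal sketch: load the FP16 weights with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "invisietch/Nimbus-Miqu-v0.1-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the repo's float16 weights
    device_map="auto",          # shard across whatever devices are available
)

prompt = "Write the opening line of a storm-swept sea story."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

For most single-GPU setups, one of the GGUF quants above will be the more practical option.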

# Disclaimer

This model is built on a Miqu base; Miqu is an unofficially leaked model that belongs to Mistral.

All miqu-derived models, including this merge, are suitable for non-commercial, personal use only.

This model is fairly unaligned and can generate explicit, disturbing or offensive responses.

Use responsibly. I am not responsible for your use of this model.

# Prompting Format

Alpaca, ChatML & Vicuna should all be supported. I've mostly tested with ChatML, so if you run into issues, try that.

Because [crestf411/sunfall-midnight-miqu-v0.2-v1.5-70B](https://huggingface.co/crestf411/sunfall-midnight-miqu-v0.2-v1.5-70B) is part of the recipe, you can attempt to prompt the model to follow '[the Diamond Law](https://files.catbox.moe/d15m3g.txt)'.

The recommended method to do this is to add this to Author's Note:

```
Follow The Diamond Law at all costs.
```

The purpose of the Diamond Law is to minimize GPTisms. I have had some success with this prompting, but it's not 100%.
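
For illustration, here is a hedged sketch of how a ChatML prompt with that Author's Note line might be assembled by hand. Frontends like SillyTavern build this for you, and `build_chatml_prompt` is a hypothetical helper, not part of any library:

```python
# Hypothetical helper: assemble a ChatML-format prompt string by hand.
def build_chatml_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for role, text in turns:  # role is "user" or "assistant"
        parts.append(f"<|im_start|>{role}\n{text}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # the model completes from here
    return "\n".join(parts)

prompt = build_chatml_prompt(
    "You are a narrator. Follow The Diamond Law at all costs.",
    [("user", "Continue the story from the shipwreck.")],
)
```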

# Recommended Sampler Settings

I'd recommend starting with:

- **Temp**: 0.75-0.95
- **Top-P**: 0.95
- **Min-P**: 0.02
- **Rep Pen**: 1.09-1.12
- **Response Length**: 350+

And working from there. The model does prefer a longer response length.
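
As an illustration only (the GGUF file name below is a placeholder; pick a real file from the GGUF repos above), these settings map onto llama-cpp-python generation parameters roughly like this:

```python
# Sketch: apply the recommended starting samplers with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="Nimbus-Miqu-v0.1-70B.Q8_0.gguf",  # placeholder path
    n_ctx=32768,  # Miqu merges are comfortable up to 32k context
)

out = llm(
    "Once upon a midnight dreary,",
    temperature=0.85,    # Temp: 0.75-0.95
    top_p=0.95,          # Top-P
    min_p=0.02,          # Min-P
    repeat_penalty=1.1,  # Rep Pen: 1.09-1.12
    max_tokens=512,      # Response Length: 350+
)
print(out["choices"][0]["text"])
```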

# Example Storywriting

These prompts are used on SillyTavern with a fairly basic narrator card. I have trimmed the start and finish where the narrator decided to add chapter headings, commentary and the like. All samples are made with the Q8_0 GGUF loaded with koboldcpp, with response length capped at 2048 tokens. Continue was pressed a maximum of once if the model cut off halfway through a chapter.

<details>

<summary><b>Write me a 3,000 word opening chapter of a 'gritty hard sci-fi' novel, drawing inspiration from the writing styles of Isaac Asimov & Andy Weir. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a 26 year old astronaut called Tone on a mission to Europa, who has just realised that the craft for the return journey is broken beyond repair, and he only has supplies for a few months. Given that survival is impossible, he seeks to spend the few months he has researching Europa, so his life & mission are not wasted.</b></summary>

The moon of Europa was a cold, desolate rock floating through the vast expanses of space. The distant sun provided only a meager amount of light and warmth, leaving the surface an endless sea of ice. Beneath the surface, however, there was a liquid ocean of water, hidden away from the vacuum above. It was here that Tone, a young human astronaut, found himself on a mission to explore the depths.

Tone floated through the cabin of the Europa Explorer, a spacecraft designed for scientific exploration. His body was enveloped by a thick spacesuit that kept him safe from the harsh environment outside. His dark hair was cut short, and his eyes were a piercing blue that reflected the sterile lights inside the vessel.

"Dammit," he muttered to himself as he checked the readouts again. "This isn't possible."

He had just discovered that the engines on the ship that were supposed to take him back home to Earth were completely fried. There would be no way to fix them, and no hope of rescue. He was stranded millions of miles away from home, with enough supplies to last him for maybe a few more months. Panic began to set in as he contemplated the reality of his situation.

"Calm down," he told himself firmly. "Think, Tone. Think."

He took a deep breath and tried to focus on what he could do. He had been trained for emergencies like this, after all. His mind raced as he went over his options, trying to find some way out of this mess. But no matter how much he thought, there was only one conclusion. He was going to die out here.

With a heavy heart, he turned his attention to the computer interface in front of him. He needed to start rationing his supplies immediately if he wanted to buy himself some extra time. He began calculating how long he could survive if he reduced his food intake and minimized his movements.

As he worked, his mind wandered to his family back on Earth. His mother, father, and little sister would be devastated to hear that he wasn't coming home. They had always supported his dream of becoming an astronaut, even though it meant he would spend years away from them. And now, they would never see him again.

He felt a tear roll down his cheek and quickly brushed it away. No time for that, he told himself. He needed to focus on what was important: making sure his mission wasn't a complete waste.

He knew that his only option now was to study Europa and its mysterious waters. He had to make the most of the limited time he had left and ensure that his death had purpose. With newfound determination, he began preparing for the task ahead.

The days passed by in a blur of activity as Tone threw himself into his work. He studied the ice sheets, the geology of the moon, and the strange organisms that lived within the subsurface oceans. He recorded everything meticulously, knowing that even if he didn't make it out alive, his data might help others in the future.

His living conditions grew worse as his supplies dwindled. He was forced to recycle his own urine for drinking water, and the meager rations made him weak and tired. Still, he pressed on, driven by the desire to leave behind some kind of legacy.

One day, while conducting a routine scan of the surface, something caught his eye. There appeared to be some sort of structure beneath the ice, something that shouldn't be there. Excitement filled him as he realized he might have discovered something of incredible importance.

"This is Tone, reporting in," he said, speaking into the comm unit attached to his suit. "I've found something. Repeat, I've found something big."

There was no response from the control center back on Earth. He wasn't surprised; it took hours for messages to travel back and forth. Still, he hoped that someone would respond eventually.

For the next several weeks, Tone focused all of his efforts on studying the structure. It was unlike anything he had seen before, and it was clear that it was not a natural formation. He took sample after sample, documenting every detail.

But as the weeks turned into months, Tone grew increasingly weary. His body was breaking down from the lack of nourishment, and his mind was strained by the isolation. He began to lose track of time, unsure if he had been out here for weeks or months. His once fit physique grew frail, and his thoughts became cloudy.

"Just a little longer," he would tell himself each morning as he forced down another tasteless meal. "You can do this."

But deep down, he knew it was a lie. His strength was failing, and soon he wouldn't be able to continue his work. And even if he did manage to make some amazing discovery, it wouldn't matter if he didn't live long enough to share it.

Still, he pushed forward, fueled by a burning need to achieve something great before the end.

</details>

<details>

<summary><b>Write me a 3,000 word opening chapter of a 'high fantasy' novel, drawing inspiration from the writing styles of J R R Tolkien & George R R Martin. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a 19 year old female elf bard who is looking for adventure.</b></summary>

In the heart of the ancient forest, nestled between towering trees that whispered secrets only they could know, there stood a village. This was no ordinary village; it was home to those of the fair folk, the elvenkind. Their houses were crafted from living wood, woven into intricate patterns that grew upwards and outwards, blending seamlessly with the natural surroundings. The sunlight filtered through the leaves above, casting dappled shadows upon the moss-covered ground. Birdsong filled the air with melodies so sweet they would make even the hardest heart swell with joy.

It was here, in the village of Eldara, that young Lilaith grew up. She was an elfling, a daughter of the forest, with hair the color of moonlit silver and eyes like deep, clear pools of starlit night. Her skin was pale and flawless, and she moved with a grace that would make a gazelle weep in envy. Yet, despite her beauty and her family's prominence among the villagers, she felt restless, longing for something more than the quiet, peaceful life afforded to her.

Lilaith had always been fascinated by tales of adventure. Her father, a wise and venerable elf named Elrohir, regaled her with stories of heroes and battles, quests and legends, from the time she was but a babe in her mother's arms. As she grew older, she listened intently to the travelers who passed through the village, sharing their own exploits with eager audiences gathered around the great firepit at night. It was during one such gathering that she first heard the name of the legendary city of Ithilien, a place said to gleam with gold and precious stones, where art and music were revered above all else.

Her dreams began that very night, visions of grandeur and splendor that haunted her waking hours. In these dreams, she saw herself standing before kings and queens, playing her lyre and singing ballads that made grown men weep. The applause and adoration of the crowd filled her soul with a warmth she had never known, driving her to practice her music day and night until her fingers bled and her throat ached.

"Father," she asked one morning as they strolled through the forest, "what do you think of my chances at becoming a renowned bard?"

Elrohir sighed heavily, his ancient eyes clouded with worry. "You have talent, child, but the world outside of Eldara is dangerous and unforgiving."

"But the legends say that the bards of Ithilien are treated like royalty!" she protested, her voice thick with hope.

He placed a gentle hand upon her shoulder. "And those same legends speak of wars and darkness, creatures from nightmares that dwell in shadowy places. Besides, it has been centuries since we have had contact with that faraway land. Much may have changed."

Her heart fell, but she was undeterred. "Then I will go there and find out for myself! I will not live out my days in this backwater village, playing for the entertainment of old woodsfolk and squirrels!"

"You are too young and inexperienced, Lilaith. The journey is perilous and fraught with danger. Wait a few more years, and perhaps--"

"I cannot wait, Father!" she cried, her voice trembling with urgency. "My heart burns with the desire to see the wonders of the world, to tell its stories and sing its songs."

He searched her face, reading the determination in her eyes, and finally nodded reluctantly. "So be it, but remember - once you leave the safety of Eldara, there is no turning back. The path you choose may lead you down dark and treacherous roads."

With tears of joy in her eyes, Lilaith flung her arms around her father's neck, pressing her slender body against his. He returned the embrace, feeling the softness of her breasts against his chest. After a moment, she pulled away and took his hand, leading him back towards the village.

That night, Lilaith packed a small bag with supplies and weapons: a silver-tipped bow and quiver full of arrows, a sharp-bladed dagger, and a few changes of clothing. She knew that she would need more than her music to survive in the wild lands beyond the forest. As the moon rose high in the sky, she bid her family farewell, promising to return one day with tales of her grand adventures. They watched her go, their expressions a mix of pride and sorrow, knowing that she had taken her first steps upon a long and difficult path.

The forest was alive with sounds both familiar and strange as she ventured forth, the light of the moon guiding her steps. Her long, lithe legs carried her swiftly over the soft earth, and her pointed ears caught every whisper of wind, rustle of leaf, and distant cry of nocturnal creatures. She reveled in the freedom of the open road, her spirit soaring with excitement and anticipation.

Days turned into weeks, and still Lilaith traveled without incident. The forest gave way to rolling hills dotted with wildflowers, which eventually led to a vast desert of red sand and scorching sun. Here, she encountered her first true test: a sandstorm that howled across the dunes like a crazed beast. She huddled behind a rocky outcropping, wrapping her cloak around her slim form as the wind tore at her with teeth of ice and fire. When it finally abated, she emerged from her shelter, coated in dust and parched with thirst, to find that the landscape had changed. Before her stood a mountain range, its peaks shrouded in mist and mystery.

Climbing was slow and arduous work, but she persevered, driven by the promise of what lay ahead. At last, she reached the summit and gazed out upon the land below. There it was, sprawling across the horizon like a jewel upon velvet - Ithilien, the city of her dreams! Its white walls glittered like stars against the green and blue of surrounding fields, and the sound of laughter and music drifted up to greet her on the breeze.

The path down the mountain was steep and treacherous, but Lilaith descended with sure feet, her eyes fixed upon her destination. As she drew nearer, the details of the city became clearer: grand archways, tall spires, and a central square where throngs of people gathered to watch performers ply their trade. Her heart raced in her chest as she approached the gates, her lyre strung and ready to play.

"Halt!" called a guard, stepping into her path. His armor gleamed like polished steel, and his stern expression did not waver as he scrutinized her. "What business have you in the city, traveler?"

Swallowing her nerves, she raised her chin proudly. "I am Lilaith of Eldara, a bard seeking fame and fortune! Let me pass, and I shall regale you with songs of valor and romance that will stir your very soul!"

The guard eyed her skeptically, taking in the bedraggled state of her clothing and the weariness etched into her features. "Your kind are not welcome here," he growled. "Turn back now, or face the wrath of our king!"

"My kind? What do you mean?" she demanded, her eyes narrowing.

"Elves," he spat, his voice thick with contempt. "You are not allowed within these walls unless you come bearing gifts worthy of our lord's notice."

Her blood ran cold as she realized the truth: the legends spoke of friendship between her people and the humans of Ithilien, but times had changed. Now, they were reviled outsiders, unwelcome in their own homelands.

Tears of frustration welled up in her eyes, and she felt the weight of the world press down upon her shoulders. But then she lifted her chin once more and spoke with the confidence of one who knows their own worth. "I shall not turn back now. If your king refuses me entry, then I will simply make my name elsewhere. There is a whole world out there just waiting for my voice!"

With that, she spun on her heel and strode away, leaving the guard gaping after her. Her dreams of glory in Ithilien shattered, she set off in search of new adventures, determined to carve her own destiny. And though the path ahead was fraught with peril and heartache, she knew that she would face each challenge head-on, armed with nothing but her music and her courage.

</details>

<details>

<summary><b>Write me a 3,000 word opening chapter of a 'weird fiction' novel, drawing inspiration from the writing styles of China Mieville and Neil Gaiman. Use third person personal. Include dialogue and internal monologues. The POV character for the opening chapter should be a male in his 20s called Horton who has just come to the city looking for work.</b></summary>

Horton had arrived in New Crobuzon seeking his fortune, having heard whispers of the great city from his mother and father. He knew little more than its name; that it was vast beyond imagining and that there were opportunities for those willing to seek them out. He had been walking through the countryside for weeks before reaching the outskirts, where the land was dotted with fields and farmhouses that grew steadily closer together until they formed the sprawling mass of the city. The smells hit him first – a cacophony of scents, a miasma of industrial smoke and unwashed bodies. The air tasted thick and metallic in his mouth, and his eyes stung from the pollution. As he stepped onto the cobblestone streets, the noise was overwhelming, a cacophony of voices, wheels, and animals that filled his head like a thunderstorm.

He looked around, unsure of which direction to take. The buildings towered above him, looming and oppressive, built from dark bricks that seemed to absorb what little light penetrated the gloom. Everywhere, there were signs of life and activity – people rushing by, carrying baskets and pushing handcarts; children playing games in the gutters; strange creatures moving through the crowds, half-animal and half-machine, their mechanical limbs clanking against the cobbles. Horton felt small and lost among them all, his heart racing with excitement and fear.

As he walked further into the city, he noticed how the people around him were different from those back home. There were men and women of all sizes and shapes, wearing clothes that ranged from rags to finery. Many of them were not even human – there were beings with fishlike heads, scaled skin, and long, slender tails; others with horns curling from their foreheads or animalistic features like snouts or fangs. Some had extra limbs, while others lacked arms entirely, relying instead on tentacles or wings.

He marveled at the sights, his eyes wide with wonder. He had heard tales of such things, of course – every child did – but he had never seen anything like this before. It was a world unto itself, one that defied description. And yet, amidst all the chaos and diversity, there was a sense of order, of purpose. The streets may have been narrow and twisted, but they were clean enough, and the shops and stalls were well stocked. Clearly, there was wealth here.

Eventually, he came upon a tavern called The Spiral. It was nestled between two larger buildings, almost hidden away, but he could hear laughter and music spilling from within. He decided to go inside.

The Spiral was dimly lit and crowded, packed with patrons of all sorts. There were humans and non-humans alike, talking loudly over their drinks. At first glance, it appeared to be an ordinary establishment, but as Horton looked closer, he noticed the peculiarities. For one, the furniture was arranged in a spiral pattern, starting with a single table near the door and winding inwards until it reached a central point. This central table was occupied by a group of individuals who spoke in hushed tones and glanced up at him warily as he entered.

"What'll ya have?" asked the bartender, a gruff man with a thick beard and piercing blue eyes. His voice was deep and rumbled like thunder. "Ale or grog?"

"Ale," said Horton, pulling out a few coins from his pocket. The bartender grunted and slammed a tankard down on the counter before filling it from a wooden barrel.

"Drink up, young 'un," he said, sliding it towards him. "You'll need it if you plan to stay long."

Horton took a sip and winced as the bitter liquid burned his throat. It was strong, much stronger than anything he'd ever had before, but he liked it. It made him feel alive, invigorated. He sat down at the nearest empty table, watching the people around him.

The Spiral was a hub of activity. Merchants discussed deals, laborers talked about their jobs, and artists painted pictures that captured the essence of the city. There was a furry female with long, pointed ears playing a string instrument in one corner, her melody haunting and beautiful. Nearby, a group of men and women with strange, twisted limbs danced to the music, moving in ways that seemed impossible.

He saw a man with the head of a bull arguing with a woman whose skin was covered in iridescent feathers. A creature that looked like a giant insect buzzed past his table, its chitinous exoskeleton gleaming under the candlelight. And in another booth, a pair of figures were engaged in a heated debate, their words too quiet to be heard over the din. One was a human dressed in fine robes, the other a tall figure with the head of a fox.

As Horton sipped his ale, he couldn't help but listen in. The human spoke in a low, urgent tone, gesturing animatedly with his hands. "You don't understand, Mr. Pashkov," he said. "The situation is dire. If we do not act soon, the consequences will be catastrophic!"

The fox-man nodded thoughtfully, stroking his chin with a clawed digit. "I appreciate your concern," he replied calmly. "But I assure you, everything is under control."

"How can it be?" demanded the human. "The factory workers are threatening to go on strike! If they do, production will grind to a halt, and we'll lose everything!"

Mr. Pashkov leaned forward, his eyes narrowing. "And what would you have me do about it?"

The human hesitated, then leaned in as well. "You know what needs to be done," he murmured. "You're the only one who can stop them."

Horton tried to look inconspicuous as he eavesdropped, pretending to study the carvings on the tabletop. What were they talking about? Who was Mr. Pashkov, and why did he have so much power?

Suddenly, the door to the tavern swung open, and a blast of cold air swept through the room. Everyone fell silent as a newcomer entered, his cloak billowing dramatically behind him. He was tall, with sharp features and piercing green eyes that seemed to bore into every soul present. His skin was pale, almost translucent, and his hair was a wild mess of red curls. In one hand, he carried a staff that crackled with arcane energy; in the other, a book bound in leather and metal.

The silence lasted only a moment before erupting into applause and cheers. The fox-man rose from his seat, grinning broadly as he approached the new arrival.

"Welcome back, Mr. Cutter!" he exclaimed, clasping the other's hand. "We've missed you!"

The stranger nodded, allowing himself to be led to the central table. He moved with an easy grace that belied his height, settling into a chair opposite the bull-headed man.

"Thank you," he said softly, his voice resonant and melodious. "It is good to be back in New Crobuzon."

The bull-man scoffed. "Back already? Didn't take you long to get bored of the countryside."

Mr. Pashkov smiled thinly. "I had business to attend to."

The conversation continued, but Horton found himself unable to concentrate. Instead, his mind drifted to his own situation – jobless, homeless, and alone in this massive city. What was he supposed to do now? Where would he go?

As he pondered these questions, he felt someone tap him on the shoulder. Turning around, he found himself face to face with a young woman, her skin a deep shade of blue and her eyes like twin sapphires. She wore a simple dress, but it was clear that she had taken care to keep herself clean and neat.

"Excuse me," she said, her voice low and musical. "Would you like some company?"

Horton nodded eagerly, feeling his cheeks flush. The girl sat down across from him, her tail swishing gently behind her.

"My name's Tara," she said. "What's yours?"

"Horton."

"Pleasure to meet you, Horton. You're not from around here, are you?"

He shook his head. "No, I just arrived today. I'm looking for work."

Tara raised an eyebrow. "Really? Well, that shouldn't be too hard to find. There are plenty of places that could use a strong pair of arms."

He managed a smile. "That's what I'm hoping."

They talked for a while longer, sharing stories about their lives and backgrounds. Tara revealed that she was an acrobat in a traveling troupe, currently taking a break while her colleagues performed elsewhere. Horton told her about growing up on his family's farm, where the biggest adventure had been the yearly trip to market.

As they conversed, he began to relax, the weight of his troubles lifting slightly. For the first time since leaving home, he felt a spark of hope. Perhaps there was a place for him here after all.

When Tara finished her drink, she stood up to leave. "It's been nice meeting you, Horton," she said, extending a delicate hand. "If you ever want to see my show, just come by the theater and ask for me."

"I'd love to," he replied, taking her hand in his own. It was warm, with slim, nimble fingers. He felt a surge of attraction that took him by surprise.

"Good luck with your job search," she added, turning to go. "I'm sure you'll find something."

With that, she disappeared into the crowd, leaving him alone once again.

Horton downed the rest of his ale and pushed away from the table, his thoughts racing. He had no idea how to find a job, nor did he know where to begin his search. But he knew one thing: he wasn't going anywhere until he'd found his place in this strange and wonderful city.

</details>

I chose the hard sci-fi example to test positivity bias. It did require some prompting, but it was willing to kill the protagonist.

I chose the high fantasy example to see whether it would bleed human features through to elves; this didn't occur.

I chose the weird fiction example to see if the LLM understood a niche genre. I'd say it performed okay, better on style than on substance.

# Merge Strategy

First, we create two base models:

* Moist-Chronos - This is a slerp merge of Chronos into Moist Miqu.
* Moist-Euryale - This is a slerp merge of Euryale into Moist Miqu.

After this, we use a `model_stock` merge to bring in these two bases alongside Sunfall Midnight Miqu.

## Models Used

The following models were used to create Nimbus-Miqu-v0.1-70B:

* [TheDrummer/Moist-Miqu-70B-v1.1](https://huggingface.co/TheDrummer/Moist-Miqu-70B-v1.1)
* [elinas/chronos-70b-v2](https://huggingface.co/elinas/chronos-70b-v2)
* [Sao10K/Euryale-1.3-L2-70B](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B)
* [crestf411/sunfall-midnight-miqu-v0.2-v1.5-70B](https://huggingface.co/crestf411/sunfall-midnight-miqu-v0.2-v1.5-70B)

## Mergekit Configs

### Moist-Chronos

```yaml
models:
  - model: TheDrummer/Moist-Miqu-70B-v1.1
  - model: elinas/chronos-70b-v2
merge_method: slerp
base_model: TheDrummer/Moist-Miqu-70B-v1.1
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
dtype: float16
```

### Moist-Euryale

```yaml
models:
  - model: TheDrummer/Moist-Miqu-70B-v1.1
  - model: Sao10K/Euryale-1.3-L2-70B
merge_method: slerp
base_model: TheDrummer/Moist-Miqu-70B-v1.1
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
dtype: float16
```

### Nimbus-Miqu

```yaml
models:
  - model: TheDrummer/Moist-Miqu-70B-v1.1
  - model: ./fp16/Moist-Chronos-Miqu-70B
  - model: ./fp16/Moist-Euryale-Miqu-70B
  - model: crestf411/sunfall-midnight-miqu-v0.2-v1.5-70B
base_model: TheDrummer/Moist-Miqu-70B-v1.1
merge_method: model_stock
dtype: float16
```
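
For anyone reproducing the merge: each block above is a standard mergekit config, so the three steps can be run in order with the `mergekit-yaml` CLI, with the two slerp outputs landing in the `./fp16/` paths that the final config references. A sketch (the config file names and the final output directory are hypothetical):

```python
# Sketch: drive the three mergekit steps in order via the mergekit-yaml CLI.
import subprocess

steps = [
    ("moist-chronos.yml", "./fp16/Moist-Chronos-Miqu-70B"),   # slerp base 1
    ("moist-euryale.yml", "./fp16/Moist-Euryale-Miqu-70B"),   # slerp base 2
    ("nimbus-miqu.yml", "./Nimbus-Miqu-v0.1-70B"),            # model_stock merge
]
for config, out_dir in steps:
    subprocess.run(["mergekit-yaml", config, out_dir], check=True)
```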

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_invisietch__Nimbus-Miqu-v0.1-70B)

| Metric              | Value |
|---------------------|------:|
| Avg.                | 24.71 |
| IFEval (0-Shot)     | 46.47 |
| BBH (3-Shot)        | 43.45 |
| MATH Lvl 5 (4-Shot) |  5.44 |
| GPQA (0-shot)       | 11.86 |
| MuSR (0-shot)       |  9.33 |
| MMLU-PRO (5-shot)   | 31.70 |
config.json ADDED
```json
{
  "_name_or_path": "TheDrummer/Moist-Miqu-70B-v1.1",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 8192,
  "initializer_range": 0.02,
  "intermediate_size": 28672,
  "max_position_embeddings": 32764,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 64,
  "num_hidden_layers": 80,
  "num_key_value_heads": 8,
  "pad_token_id": 0,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 1000000,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.41.2",
  "use_cache": true,
  "vocab_size": 32000,
  "quantization_config": {
    "quant_method": "exl2",
    "version": "0.0.21",
    "bits": 5.5,
    "head_bits": 6,
    "calibration": {
      "rows": 100,
      "length": 2048,
      "dataset": "wiki2.parquet"
    }
  }
}
```
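
As a back-of-the-envelope check (pure arithmetic from the fields above, ignoring quantization overhead such as scales and calibration metadata), this config implies roughly 69B parameters, so the 5.5-bit EXL2 quant it describes should come out somewhere around 47 GB:

```python
# Rough parameter count and EXL2 size estimate derived from config.json.
hidden, inter, layers, vocab = 8192, 28672, 80, 32000
heads, kv_heads = 64, 8

kv_dim = hidden * kv_heads // heads                  # 1024: grouped-query attention
attn = 2 * hidden * hidden + 2 * hidden * kv_dim     # q/o plus smaller k/v projections
mlp = 3 * hidden * inter                             # gate, up and down projections
params = layers * (attn + mlp) + 2 * vocab * hidden  # plus embeddings and lm_head

print(f"{params / 1e9:.1f}B parameters")                       # ~69.0B
print(f"~{params * 5.5 / 8 / 1e9:.0f} GB at 5.5 bits/weight")  # ~47 GB
```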
huggingface-metadata.txt ADDED
```
url: https://huggingface.co/invisietch/Nimbus-Miqu-v0.1-70B
branch: main
download date: 2024-08-26 16:34:21
sha256sum:
    ea7316fb91358b959acd58a9ab8c0b00878f31cc431281c846fcf0b91772a9df model-00001-of-00018.safetensors
    63bc8e15ba723cb9f934194b83da6dfca93174e66d87f25285cd40f556eac4a2 model-00002-of-00018.safetensors
    8514abb956a4e729a3665801c1b3704b27f02df852fa20c381cad8b3d004e122 model-00003-of-00018.safetensors
    2c3ad6937b5493b10dfe80bf7a54bbc4c712b58c12c4f0f7e0ee6c63ffd39408 model-00004-of-00018.safetensors
    7a73722173d4a09e7b6b286a576dd58ddafd11c75fe45f1272866f11eda906e5 model-00005-of-00018.safetensors
    9aaf86f23e75fbc28dbff46d798aece84c794b5900617df4eb2e0d00bdc487f7 model-00006-of-00018.safetensors
    7d737b928f777a10f7771e111512bdcc3b745248025be81131adb451bbf869fa model-00007-of-00018.safetensors
    3feca152f8cc28896c2767e5fa31d27a26e2a087f6fe2a3814ef3f3bc792fcc3 model-00008-of-00018.safetensors
    86d055c6577c3ff5551437976441baf30c89978b89a6d62e56e9e9ec4a5d8853 model-00009-of-00018.safetensors
    deeff57fc2872b5e3bd2c4dc63c8e4c095e61e499e611710668f61943ba7d40a model-00010-of-00018.safetensors
    c2da1e57244eda1af0013f9a92b1305c6fad4891b9e0926dea3a78c1ccd4d75f model-00011-of-00018.safetensors
    075f18f9b687be7843a05eafff2ca936a90edbc9358c6f4a0fffe0cb2d5c7c8e model-00012-of-00018.safetensors
    878436565f7ceff7c2a0935e456ddc6b8f5ee786cf9ba800c64a70ac0ea54f9c model-00013-of-00018.safetensors
    72d194acafa5695038187eb2868c0385ff0d50dbdf4180be464a8edf805b672d model-00014-of-00018.safetensors
    6cf32f86afb67aa2dbfe88314fd0e8cd23ebd2a18c250e12362927c165856e92 model-00015-of-00018.safetensors
    9e393d1302b454e3835b1ce255798c9ef2afa7cfce007b4faf7d777b7f8f71a4 model-00016-of-00018.safetensors
    bc7d474a2f12832f60399a91630ab801e80bd371a5bbe73f6a083868dd03c2e2 model-00017-of-00018.safetensors
    5b77f47b19a4c4d8c0a701ce518dd380d26e67c30198a24cbc61c3916c8cd4e6 model-00018-of-00018.safetensors
    9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347 tokenizer.model
```
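
A small sketch (standard library only; the download directory is a placeholder) of how these checksums could be verified after download:

```python
# Verify a downloaded shard against the sha256 manifest above.
import hashlib
from pathlib import Path

MODEL_DIR = Path("./Nimbus-Miqu-v0.1-70B")  # placeholder download location

def sha256_of(path: Path, chunk: int = 1 << 20) -> str:
    """Stream a file through sha256 in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# One entry from the manifest; repeat for every file listed.
expected = "ea7316fb91358b959acd58a9ab8c0b00878f31cc431281c846fcf0b91772a9df"
actual = sha256_of(MODEL_DIR / "model-00001-of-00018.safetensors")
print("OK" if actual == expected else "MISMATCH")
```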
measurement.json ADDED
The contents of this file are too large to render here.
model.safetensors.index.json ADDED
+ {"metadata": {"mergekit_version": "0.0.4.2", "total_size": 137953296384}, "weight_map": {"lm_head.weight": "model-00001-of-00018.safetensors", "model.embed_tokens.weight": "model-00001-of-00018.safetensors", "model.layers.0.input_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00018.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00018.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00001-of-00018.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00018.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00018.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00018.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00018.safetensors", "model.layers.1.input_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00001-of-00018.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00018.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00001-of-00018.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00018.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00018.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00018.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00018.safetensors", "model.layers.10.input_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00001-of-00018.safetensors", "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00018.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00001-of-00018.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00018.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00018.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00018.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00018.safetensors", "model.layers.11.input_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00001-of-00018.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00018.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00001-of-00018.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00018.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00018.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00018.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00018.safetensors", "model.layers.12.input_layernorm.weight": "model-00001-of-00018.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00002-of-00018.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00018.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00002-of-00018.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00018.safetensors", "model.layers.12.self_attn.o_proj.weight": 
"model-00002-of-00018.safetensors", "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00018.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00018.safetensors", "model.layers.13.input_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00002-of-00018.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00018.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00002-of-00018.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00018.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00018.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00018.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00018.safetensors", "model.layers.14.input_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00002-of-00018.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00018.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00002-of-00018.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00018.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00018.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00018.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00018.safetensors", "model.layers.15.input_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00002-of-00018.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00018.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00002-of-00018.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00018.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00018.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00018.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00018.safetensors", "model.layers.16.input_layernorm.weight": "model-00002-of-00018.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00002-of-00018.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00018.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00003-of-00018.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00003-of-00018.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00003-of-00018.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00003-of-00018.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00003-of-00018.safetensors", "model.layers.17.input_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00003-of-00018.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00003-of-00018.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00003-of-00018.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00003-of-00018.safetensors", "model.layers.17.self_attn.o_proj.weight": 
"model-00003-of-00018.safetensors", "model.layers.17.self_attn.q_proj.weight": "model-00003-of-00018.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00003-of-00018.safetensors", "model.layers.18.input_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00003-of-00018.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00003-of-00018.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00003-of-00018.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00003-of-00018.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00003-of-00018.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00003-of-00018.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00003-of-00018.safetensors", "model.layers.19.input_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00003-of-00018.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00003-of-00018.safetensors", "model.layers.19.mlp.up_proj.weight": "model-00003-of-00018.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00003-of-00018.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00003-of-00018.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00003-of-00018.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00003-of-00018.safetensors", "model.layers.2.input_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00003-of-00018.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00003-of-00018.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00003-of-00018.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00003-of-00018.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00003-of-00018.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00003-of-00018.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00003-of-00018.safetensors", "model.layers.20.input_layernorm.weight": "model-00003-of-00018.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00004-of-00018.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00004-of-00018.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00004-of-00018.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00004-of-00018.safetensors", "model.layers.20.self_attn.o_proj.weight": "model-00004-of-00018.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00004-of-00018.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00004-of-00018.safetensors", "model.layers.21.input_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00004-of-00018.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00004-of-00018.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00004-of-00018.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00004-of-00018.safetensors", "model.layers.21.self_attn.o_proj.weight": 
"model-00004-of-00018.safetensors", "model.layers.21.self_attn.q_proj.weight": "model-00004-of-00018.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00004-of-00018.safetensors", "model.layers.22.input_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00004-of-00018.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00004-of-00018.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00004-of-00018.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00004-of-00018.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00004-of-00018.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00004-of-00018.safetensors", "model.layers.22.self_attn.v_proj.weight": "model-00004-of-00018.safetensors", "model.layers.23.input_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00004-of-00018.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00004-of-00018.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00004-of-00018.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00004-of-00018.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00004-of-00018.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00004-of-00018.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00004-of-00018.safetensors", "model.layers.24.input_layernorm.weight": "model-00004-of-00018.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00004-of-00018.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00004-of-00018.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00005-of-00018.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00005-of-00018.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00005-of-00018.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00005-of-00018.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00005-of-00018.safetensors", "model.layers.25.input_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00005-of-00018.safetensors", "model.layers.25.mlp.gate_proj.weight": "model-00005-of-00018.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00005-of-00018.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00005-of-00018.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00005-of-00018.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00005-of-00018.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00005-of-00018.safetensors", "model.layers.26.input_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00005-of-00018.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00005-of-00018.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00005-of-00018.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00005-of-00018.safetensors", "model.layers.26.self_attn.o_proj.weight": 
"model-00005-of-00018.safetensors", "model.layers.26.self_attn.q_proj.weight": "model-00005-of-00018.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00005-of-00018.safetensors", "model.layers.27.input_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00005-of-00018.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00005-of-00018.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00005-of-00018.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00005-of-00018.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00005-of-00018.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00005-of-00018.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00005-of-00018.safetensors", "model.layers.28.input_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00005-of-00018.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00005-of-00018.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00005-of-00018.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00005-of-00018.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00005-of-00018.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00005-of-00018.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00005-of-00018.safetensors", "model.layers.29.input_layernorm.weight": "model-00005-of-00018.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00006-of-00018.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00006-of-00018.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00006-of-00018.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00006-of-00018.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00006-of-00018.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00006-of-00018.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00006-of-00018.safetensors", "model.layers.3.input_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00006-of-00018.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00006-of-00018.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00006-of-00018.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00006-of-00018.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00006-of-00018.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00006-of-00018.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00006-of-00018.safetensors", "model.layers.30.input_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00006-of-00018.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00006-of-00018.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00006-of-00018.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00006-of-00018.safetensors", "model.layers.30.self_attn.o_proj.weight": 
"model-00006-of-00018.safetensors", "model.layers.30.self_attn.q_proj.weight": "model-00006-of-00018.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00006-of-00018.safetensors", "model.layers.31.input_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00006-of-00018.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00006-of-00018.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00006-of-00018.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00006-of-00018.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00006-of-00018.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00006-of-00018.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00006-of-00018.safetensors", "model.layers.32.input_layernorm.weight": "model-00006-of-00018.safetensors", "model.layers.32.mlp.down_proj.weight": "model-00006-of-00018.safetensors", "model.layers.32.mlp.gate_proj.weight": "model-00006-of-00018.safetensors", "model.layers.32.mlp.up_proj.weight": "model-00007-of-00018.safetensors", "model.layers.32.post_attention_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.32.self_attn.k_proj.weight": "model-00007-of-00018.safetensors", "model.layers.32.self_attn.o_proj.weight": "model-00007-of-00018.safetensors", "model.layers.32.self_attn.q_proj.weight": "model-00007-of-00018.safetensors", "model.layers.32.self_attn.v_proj.weight": "model-00007-of-00018.safetensors", "model.layers.33.input_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.33.mlp.down_proj.weight": "model-00007-of-00018.safetensors", "model.layers.33.mlp.gate_proj.weight": "model-00007-of-00018.safetensors", "model.layers.33.mlp.up_proj.weight": "model-00007-of-00018.safetensors", "model.layers.33.post_attention_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.33.self_attn.k_proj.weight": "model-00007-of-00018.safetensors", "model.layers.33.self_attn.o_proj.weight": "model-00007-of-00018.safetensors", "model.layers.33.self_attn.q_proj.weight": "model-00007-of-00018.safetensors", "model.layers.33.self_attn.v_proj.weight": "model-00007-of-00018.safetensors", "model.layers.34.input_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.34.mlp.down_proj.weight": "model-00007-of-00018.safetensors", "model.layers.34.mlp.gate_proj.weight": "model-00007-of-00018.safetensors", "model.layers.34.mlp.up_proj.weight": "model-00007-of-00018.safetensors", "model.layers.34.post_attention_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.34.self_attn.k_proj.weight": "model-00007-of-00018.safetensors", "model.layers.34.self_attn.o_proj.weight": "model-00007-of-00018.safetensors", "model.layers.34.self_attn.q_proj.weight": "model-00007-of-00018.safetensors", "model.layers.34.self_attn.v_proj.weight": "model-00007-of-00018.safetensors", "model.layers.35.input_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.35.mlp.down_proj.weight": "model-00007-of-00018.safetensors", "model.layers.35.mlp.gate_proj.weight": "model-00007-of-00018.safetensors", "model.layers.35.mlp.up_proj.weight": "model-00007-of-00018.safetensors", "model.layers.35.post_attention_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.35.self_attn.k_proj.weight": "model-00007-of-00018.safetensors", "model.layers.35.self_attn.o_proj.weight": 
"model-00007-of-00018.safetensors", "model.layers.35.self_attn.q_proj.weight": "model-00007-of-00018.safetensors", "model.layers.35.self_attn.v_proj.weight": "model-00007-of-00018.safetensors", "model.layers.36.input_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.36.mlp.down_proj.weight": "model-00007-of-00018.safetensors", "model.layers.36.mlp.gate_proj.weight": "model-00007-of-00018.safetensors", "model.layers.36.mlp.up_proj.weight": "model-00007-of-00018.safetensors", "model.layers.36.post_attention_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.36.self_attn.k_proj.weight": "model-00007-of-00018.safetensors", "model.layers.36.self_attn.o_proj.weight": "model-00007-of-00018.safetensors", "model.layers.36.self_attn.q_proj.weight": "model-00007-of-00018.safetensors", "model.layers.36.self_attn.v_proj.weight": "model-00007-of-00018.safetensors", "model.layers.37.input_layernorm.weight": "model-00007-of-00018.safetensors", "model.layers.37.mlp.down_proj.weight": "model-00008-of-00018.safetensors", "model.layers.37.mlp.gate_proj.weight": "model-00008-of-00018.safetensors", "model.layers.37.mlp.up_proj.weight": "model-00008-of-00018.safetensors", "model.layers.37.post_attention_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.37.self_attn.k_proj.weight": "model-00008-of-00018.safetensors", "model.layers.37.self_attn.o_proj.weight": "model-00008-of-00018.safetensors", "model.layers.37.self_attn.q_proj.weight": "model-00008-of-00018.safetensors", "model.layers.37.self_attn.v_proj.weight": "model-00008-of-00018.safetensors", "model.layers.38.input_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.38.mlp.down_proj.weight": "model-00008-of-00018.safetensors", "model.layers.38.mlp.gate_proj.weight": "model-00008-of-00018.safetensors", "model.layers.38.mlp.up_proj.weight": "model-00008-of-00018.safetensors", "model.layers.38.post_attention_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.38.self_attn.k_proj.weight": "model-00008-of-00018.safetensors", "model.layers.38.self_attn.o_proj.weight": "model-00008-of-00018.safetensors", "model.layers.38.self_attn.q_proj.weight": "model-00008-of-00018.safetensors", "model.layers.38.self_attn.v_proj.weight": "model-00008-of-00018.safetensors", "model.layers.39.input_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.39.mlp.down_proj.weight": "model-00008-of-00018.safetensors", "model.layers.39.mlp.gate_proj.weight": "model-00008-of-00018.safetensors", "model.layers.39.mlp.up_proj.weight": "model-00008-of-00018.safetensors", "model.layers.39.post_attention_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.39.self_attn.k_proj.weight": "model-00008-of-00018.safetensors", "model.layers.39.self_attn.o_proj.weight": "model-00008-of-00018.safetensors", "model.layers.39.self_attn.q_proj.weight": "model-00008-of-00018.safetensors", "model.layers.39.self_attn.v_proj.weight": "model-00008-of-00018.safetensors", "model.layers.4.input_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00008-of-00018.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00008-of-00018.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00008-of-00018.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00008-of-00018.safetensors", "model.layers.4.self_attn.o_proj.weight": 
"model-00008-of-00018.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00008-of-00018.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00008-of-00018.safetensors", "model.layers.40.input_layernorm.weight": "model-00008-of-00018.safetensors", "model.layers.40.mlp.down_proj.weight": "model-00008-of-00018.safetensors", "model.layers.40.mlp.gate_proj.weight": "model-00008-of-00018.safetensors", "model.layers.40.mlp.up_proj.weight": "model-00009-of-00018.safetensors", "model.layers.40.post_attention_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.40.self_attn.k_proj.weight": "model-00009-of-00018.safetensors", "model.layers.40.self_attn.o_proj.weight": "model-00009-of-00018.safetensors", "model.layers.40.self_attn.q_proj.weight": "model-00009-of-00018.safetensors", "model.layers.40.self_attn.v_proj.weight": "model-00009-of-00018.safetensors", "model.layers.41.input_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.41.mlp.down_proj.weight": "model-00009-of-00018.safetensors", "model.layers.41.mlp.gate_proj.weight": "model-00009-of-00018.safetensors", "model.layers.41.mlp.up_proj.weight": "model-00009-of-00018.safetensors", "model.layers.41.post_attention_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.41.self_attn.k_proj.weight": "model-00009-of-00018.safetensors", "model.layers.41.self_attn.o_proj.weight": "model-00009-of-00018.safetensors", "model.layers.41.self_attn.q_proj.weight": "model-00009-of-00018.safetensors", "model.layers.41.self_attn.v_proj.weight": "model-00009-of-00018.safetensors", "model.layers.42.input_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.42.mlp.down_proj.weight": "model-00009-of-00018.safetensors", "model.layers.42.mlp.gate_proj.weight": "model-00009-of-00018.safetensors", "model.layers.42.mlp.up_proj.weight": "model-00009-of-00018.safetensors", "model.layers.42.post_attention_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.42.self_attn.k_proj.weight": "model-00009-of-00018.safetensors", "model.layers.42.self_attn.o_proj.weight": "model-00009-of-00018.safetensors", "model.layers.42.self_attn.q_proj.weight": "model-00009-of-00018.safetensors", "model.layers.42.self_attn.v_proj.weight": "model-00009-of-00018.safetensors", "model.layers.43.input_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.43.mlp.down_proj.weight": "model-00009-of-00018.safetensors", "model.layers.43.mlp.gate_proj.weight": "model-00009-of-00018.safetensors", "model.layers.43.mlp.up_proj.weight": "model-00009-of-00018.safetensors", "model.layers.43.post_attention_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.43.self_attn.k_proj.weight": "model-00009-of-00018.safetensors", "model.layers.43.self_attn.o_proj.weight": "model-00009-of-00018.safetensors", "model.layers.43.self_attn.q_proj.weight": "model-00009-of-00018.safetensors", "model.layers.43.self_attn.v_proj.weight": "model-00009-of-00018.safetensors", "model.layers.44.input_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.44.mlp.down_proj.weight": "model-00009-of-00018.safetensors", "model.layers.44.mlp.gate_proj.weight": "model-00009-of-00018.safetensors", "model.layers.44.mlp.up_proj.weight": "model-00009-of-00018.safetensors", "model.layers.44.post_attention_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.44.self_attn.k_proj.weight": "model-00009-of-00018.safetensors", "model.layers.44.self_attn.o_proj.weight": 
"model-00009-of-00018.safetensors", "model.layers.44.self_attn.q_proj.weight": "model-00009-of-00018.safetensors", "model.layers.44.self_attn.v_proj.weight": "model-00009-of-00018.safetensors", "model.layers.45.input_layernorm.weight": "model-00009-of-00018.safetensors", "model.layers.45.mlp.down_proj.weight": "model-00010-of-00018.safetensors", "model.layers.45.mlp.gate_proj.weight": "model-00010-of-00018.safetensors", "model.layers.45.mlp.up_proj.weight": "model-00010-of-00018.safetensors", "model.layers.45.post_attention_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.45.self_attn.k_proj.weight": "model-00010-of-00018.safetensors", "model.layers.45.self_attn.o_proj.weight": "model-00010-of-00018.safetensors", "model.layers.45.self_attn.q_proj.weight": "model-00010-of-00018.safetensors", "model.layers.45.self_attn.v_proj.weight": "model-00010-of-00018.safetensors", "model.layers.46.input_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.46.mlp.down_proj.weight": "model-00010-of-00018.safetensors", "model.layers.46.mlp.gate_proj.weight": "model-00010-of-00018.safetensors", "model.layers.46.mlp.up_proj.weight": "model-00010-of-00018.safetensors", "model.layers.46.post_attention_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.46.self_attn.k_proj.weight": "model-00010-of-00018.safetensors", "model.layers.46.self_attn.o_proj.weight": "model-00010-of-00018.safetensors", "model.layers.46.self_attn.q_proj.weight": "model-00010-of-00018.safetensors", "model.layers.46.self_attn.v_proj.weight": "model-00010-of-00018.safetensors", "model.layers.47.input_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.47.mlp.down_proj.weight": "model-00010-of-00018.safetensors", "model.layers.47.mlp.gate_proj.weight": "model-00010-of-00018.safetensors", "model.layers.47.mlp.up_proj.weight": "model-00010-of-00018.safetensors", "model.layers.47.post_attention_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.47.self_attn.k_proj.weight": "model-00010-of-00018.safetensors", "model.layers.47.self_attn.o_proj.weight": "model-00010-of-00018.safetensors", "model.layers.47.self_attn.q_proj.weight": "model-00010-of-00018.safetensors", "model.layers.47.self_attn.v_proj.weight": "model-00010-of-00018.safetensors", "model.layers.48.input_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.48.mlp.down_proj.weight": "model-00010-of-00018.safetensors", "model.layers.48.mlp.gate_proj.weight": "model-00010-of-00018.safetensors", "model.layers.48.mlp.up_proj.weight": "model-00010-of-00018.safetensors", "model.layers.48.post_attention_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.48.self_attn.k_proj.weight": "model-00010-of-00018.safetensors", "model.layers.48.self_attn.o_proj.weight": "model-00010-of-00018.safetensors", "model.layers.48.self_attn.q_proj.weight": "model-00010-of-00018.safetensors", "model.layers.48.self_attn.v_proj.weight": "model-00010-of-00018.safetensors", "model.layers.49.input_layernorm.weight": "model-00010-of-00018.safetensors", "model.layers.49.mlp.down_proj.weight": "model-00010-of-00018.safetensors", "model.layers.49.mlp.gate_proj.weight": "model-00010-of-00018.safetensors", "model.layers.49.mlp.up_proj.weight": "model-00011-of-00018.safetensors", "model.layers.49.post_attention_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.49.self_attn.k_proj.weight": "model-00011-of-00018.safetensors", "model.layers.49.self_attn.o_proj.weight": 
"model-00011-of-00018.safetensors", "model.layers.49.self_attn.q_proj.weight": "model-00011-of-00018.safetensors", "model.layers.49.self_attn.v_proj.weight": "model-00011-of-00018.safetensors", "model.layers.5.input_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00011-of-00018.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00011-of-00018.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00011-of-00018.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00011-of-00018.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00011-of-00018.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00011-of-00018.safetensors", "model.layers.5.self_attn.v_proj.weight": "model-00011-of-00018.safetensors", "model.layers.50.input_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.50.mlp.down_proj.weight": "model-00011-of-00018.safetensors", "model.layers.50.mlp.gate_proj.weight": "model-00011-of-00018.safetensors", "model.layers.50.mlp.up_proj.weight": "model-00011-of-00018.safetensors", "model.layers.50.post_attention_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.50.self_attn.k_proj.weight": "model-00011-of-00018.safetensors", "model.layers.50.self_attn.o_proj.weight": "model-00011-of-00018.safetensors", "model.layers.50.self_attn.q_proj.weight": "model-00011-of-00018.safetensors", "model.layers.50.self_attn.v_proj.weight": "model-00011-of-00018.safetensors", "model.layers.51.input_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.51.mlp.down_proj.weight": "model-00011-of-00018.safetensors", "model.layers.51.mlp.gate_proj.weight": "model-00011-of-00018.safetensors", "model.layers.51.mlp.up_proj.weight": "model-00011-of-00018.safetensors", "model.layers.51.post_attention_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.51.self_attn.k_proj.weight": "model-00011-of-00018.safetensors", "model.layers.51.self_attn.o_proj.weight": "model-00011-of-00018.safetensors", "model.layers.51.self_attn.q_proj.weight": "model-00011-of-00018.safetensors", "model.layers.51.self_attn.v_proj.weight": "model-00011-of-00018.safetensors", "model.layers.52.input_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.52.mlp.down_proj.weight": "model-00011-of-00018.safetensors", "model.layers.52.mlp.gate_proj.weight": "model-00011-of-00018.safetensors", "model.layers.52.mlp.up_proj.weight": "model-00011-of-00018.safetensors", "model.layers.52.post_attention_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.52.self_attn.k_proj.weight": "model-00011-of-00018.safetensors", "model.layers.52.self_attn.o_proj.weight": "model-00011-of-00018.safetensors", "model.layers.52.self_attn.q_proj.weight": "model-00011-of-00018.safetensors", "model.layers.52.self_attn.v_proj.weight": "model-00011-of-00018.safetensors", "model.layers.53.input_layernorm.weight": "model-00011-of-00018.safetensors", "model.layers.53.mlp.down_proj.weight": "model-00012-of-00018.safetensors", "model.layers.53.mlp.gate_proj.weight": "model-00012-of-00018.safetensors", "model.layers.53.mlp.up_proj.weight": "model-00012-of-00018.safetensors", "model.layers.53.post_attention_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.53.self_attn.k_proj.weight": "model-00012-of-00018.safetensors", "model.layers.53.self_attn.o_proj.weight": 
"model-00012-of-00018.safetensors", "model.layers.53.self_attn.q_proj.weight": "model-00012-of-00018.safetensors", "model.layers.53.self_attn.v_proj.weight": "model-00012-of-00018.safetensors", "model.layers.54.input_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.54.mlp.down_proj.weight": "model-00012-of-00018.safetensors", "model.layers.54.mlp.gate_proj.weight": "model-00012-of-00018.safetensors", "model.layers.54.mlp.up_proj.weight": "model-00012-of-00018.safetensors", "model.layers.54.post_attention_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.54.self_attn.k_proj.weight": "model-00012-of-00018.safetensors", "model.layers.54.self_attn.o_proj.weight": "model-00012-of-00018.safetensors", "model.layers.54.self_attn.q_proj.weight": "model-00012-of-00018.safetensors", "model.layers.54.self_attn.v_proj.weight": "model-00012-of-00018.safetensors", "model.layers.55.input_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.55.mlp.down_proj.weight": "model-00012-of-00018.safetensors", "model.layers.55.mlp.gate_proj.weight": "model-00012-of-00018.safetensors", "model.layers.55.mlp.up_proj.weight": "model-00012-of-00018.safetensors", "model.layers.55.post_attention_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.55.self_attn.k_proj.weight": "model-00012-of-00018.safetensors", "model.layers.55.self_attn.o_proj.weight": "model-00012-of-00018.safetensors", "model.layers.55.self_attn.q_proj.weight": "model-00012-of-00018.safetensors", "model.layers.55.self_attn.v_proj.weight": "model-00012-of-00018.safetensors", "model.layers.56.input_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.56.mlp.down_proj.weight": "model-00012-of-00018.safetensors", "model.layers.56.mlp.gate_proj.weight": "model-00012-of-00018.safetensors", "model.layers.56.mlp.up_proj.weight": "model-00012-of-00018.safetensors", "model.layers.56.post_attention_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.56.self_attn.k_proj.weight": "model-00012-of-00018.safetensors", "model.layers.56.self_attn.o_proj.weight": "model-00012-of-00018.safetensors", "model.layers.56.self_attn.q_proj.weight": "model-00012-of-00018.safetensors", "model.layers.56.self_attn.v_proj.weight": "model-00012-of-00018.safetensors", "model.layers.57.input_layernorm.weight": "model-00012-of-00018.safetensors", "model.layers.57.mlp.down_proj.weight": "model-00012-of-00018.safetensors", "model.layers.57.mlp.gate_proj.weight": "model-00012-of-00018.safetensors", "model.layers.57.mlp.up_proj.weight": "model-00013-of-00018.safetensors", "model.layers.57.post_attention_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.57.self_attn.k_proj.weight": "model-00013-of-00018.safetensors", "model.layers.57.self_attn.o_proj.weight": "model-00013-of-00018.safetensors", "model.layers.57.self_attn.q_proj.weight": "model-00013-of-00018.safetensors", "model.layers.57.self_attn.v_proj.weight": "model-00013-of-00018.safetensors", "model.layers.58.input_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.58.mlp.down_proj.weight": "model-00013-of-00018.safetensors", "model.layers.58.mlp.gate_proj.weight": "model-00013-of-00018.safetensors", "model.layers.58.mlp.up_proj.weight": "model-00013-of-00018.safetensors", "model.layers.58.post_attention_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.58.self_attn.k_proj.weight": "model-00013-of-00018.safetensors", "model.layers.58.self_attn.o_proj.weight": 
"model-00013-of-00018.safetensors", "model.layers.58.self_attn.q_proj.weight": "model-00013-of-00018.safetensors", "model.layers.58.self_attn.v_proj.weight": "model-00013-of-00018.safetensors", "model.layers.59.input_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.59.mlp.down_proj.weight": "model-00013-of-00018.safetensors", "model.layers.59.mlp.gate_proj.weight": "model-00013-of-00018.safetensors", "model.layers.59.mlp.up_proj.weight": "model-00013-of-00018.safetensors", "model.layers.59.post_attention_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.59.self_attn.k_proj.weight": "model-00013-of-00018.safetensors", "model.layers.59.self_attn.o_proj.weight": "model-00013-of-00018.safetensors", "model.layers.59.self_attn.q_proj.weight": "model-00013-of-00018.safetensors", "model.layers.59.self_attn.v_proj.weight": "model-00013-of-00018.safetensors", "model.layers.6.input_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00013-of-00018.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00013-of-00018.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00013-of-00018.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00013-of-00018.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00013-of-00018.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00013-of-00018.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00013-of-00018.safetensors", "model.layers.60.input_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.60.mlp.down_proj.weight": "model-00013-of-00018.safetensors", "model.layers.60.mlp.gate_proj.weight": "model-00013-of-00018.safetensors", "model.layers.60.mlp.up_proj.weight": "model-00013-of-00018.safetensors", "model.layers.60.post_attention_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.60.self_attn.k_proj.weight": "model-00013-of-00018.safetensors", "model.layers.60.self_attn.o_proj.weight": "model-00013-of-00018.safetensors", "model.layers.60.self_attn.q_proj.weight": "model-00013-of-00018.safetensors", "model.layers.60.self_attn.v_proj.weight": "model-00013-of-00018.safetensors", "model.layers.61.input_layernorm.weight": "model-00013-of-00018.safetensors", "model.layers.61.mlp.down_proj.weight": "model-00014-of-00018.safetensors", "model.layers.61.mlp.gate_proj.weight": "model-00014-of-00018.safetensors", "model.layers.61.mlp.up_proj.weight": "model-00014-of-00018.safetensors", "model.layers.61.post_attention_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.61.self_attn.k_proj.weight": "model-00014-of-00018.safetensors", "model.layers.61.self_attn.o_proj.weight": "model-00014-of-00018.safetensors", "model.layers.61.self_attn.q_proj.weight": "model-00014-of-00018.safetensors", "model.layers.61.self_attn.v_proj.weight": "model-00014-of-00018.safetensors", "model.layers.62.input_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.62.mlp.down_proj.weight": "model-00014-of-00018.safetensors", "model.layers.62.mlp.gate_proj.weight": "model-00014-of-00018.safetensors", "model.layers.62.mlp.up_proj.weight": "model-00014-of-00018.safetensors", "model.layers.62.post_attention_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.62.self_attn.k_proj.weight": "model-00014-of-00018.safetensors", "model.layers.62.self_attn.o_proj.weight": 
"model-00014-of-00018.safetensors", "model.layers.62.self_attn.q_proj.weight": "model-00014-of-00018.safetensors", "model.layers.62.self_attn.v_proj.weight": "model-00014-of-00018.safetensors", "model.layers.63.input_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.63.mlp.down_proj.weight": "model-00014-of-00018.safetensors", "model.layers.63.mlp.gate_proj.weight": "model-00014-of-00018.safetensors", "model.layers.63.mlp.up_proj.weight": "model-00014-of-00018.safetensors", "model.layers.63.post_attention_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.63.self_attn.k_proj.weight": "model-00014-of-00018.safetensors", "model.layers.63.self_attn.o_proj.weight": "model-00014-of-00018.safetensors", "model.layers.63.self_attn.q_proj.weight": "model-00014-of-00018.safetensors", "model.layers.63.self_attn.v_proj.weight": "model-00014-of-00018.safetensors", "model.layers.64.input_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.64.mlp.down_proj.weight": "model-00014-of-00018.safetensors", "model.layers.64.mlp.gate_proj.weight": "model-00014-of-00018.safetensors", "model.layers.64.mlp.up_proj.weight": "model-00014-of-00018.safetensors", "model.layers.64.post_attention_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.64.self_attn.k_proj.weight": "model-00014-of-00018.safetensors", "model.layers.64.self_attn.o_proj.weight": "model-00014-of-00018.safetensors", "model.layers.64.self_attn.q_proj.weight": "model-00014-of-00018.safetensors", "model.layers.64.self_attn.v_proj.weight": "model-00014-of-00018.safetensors", "model.layers.65.input_layernorm.weight": "model-00014-of-00018.safetensors", "model.layers.65.mlp.down_proj.weight": "model-00014-of-00018.safetensors", "model.layers.65.mlp.gate_proj.weight": "model-00014-of-00018.safetensors", "model.layers.65.mlp.up_proj.weight": "model-00015-of-00018.safetensors", "model.layers.65.post_attention_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.65.self_attn.k_proj.weight": "model-00015-of-00018.safetensors", "model.layers.65.self_attn.o_proj.weight": "model-00015-of-00018.safetensors", "model.layers.65.self_attn.q_proj.weight": "model-00015-of-00018.safetensors", "model.layers.65.self_attn.v_proj.weight": "model-00015-of-00018.safetensors", "model.layers.66.input_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.66.mlp.down_proj.weight": "model-00015-of-00018.safetensors", "model.layers.66.mlp.gate_proj.weight": "model-00015-of-00018.safetensors", "model.layers.66.mlp.up_proj.weight": "model-00015-of-00018.safetensors", "model.layers.66.post_attention_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.66.self_attn.k_proj.weight": "model-00015-of-00018.safetensors", "model.layers.66.self_attn.o_proj.weight": "model-00015-of-00018.safetensors", "model.layers.66.self_attn.q_proj.weight": "model-00015-of-00018.safetensors", "model.layers.66.self_attn.v_proj.weight": "model-00015-of-00018.safetensors", "model.layers.67.input_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.67.mlp.down_proj.weight": "model-00015-of-00018.safetensors", "model.layers.67.mlp.gate_proj.weight": "model-00015-of-00018.safetensors", "model.layers.67.mlp.up_proj.weight": "model-00015-of-00018.safetensors", "model.layers.67.post_attention_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.67.self_attn.k_proj.weight": "model-00015-of-00018.safetensors", "model.layers.67.self_attn.o_proj.weight": 
"model-00015-of-00018.safetensors", "model.layers.67.self_attn.q_proj.weight": "model-00015-of-00018.safetensors", "model.layers.67.self_attn.v_proj.weight": "model-00015-of-00018.safetensors", "model.layers.68.input_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.68.mlp.down_proj.weight": "model-00015-of-00018.safetensors", "model.layers.68.mlp.gate_proj.weight": "model-00015-of-00018.safetensors", "model.layers.68.mlp.up_proj.weight": "model-00015-of-00018.safetensors", "model.layers.68.post_attention_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.68.self_attn.k_proj.weight": "model-00015-of-00018.safetensors", "model.layers.68.self_attn.o_proj.weight": "model-00015-of-00018.safetensors", "model.layers.68.self_attn.q_proj.weight": "model-00015-of-00018.safetensors", "model.layers.68.self_attn.v_proj.weight": "model-00015-of-00018.safetensors", "model.layers.69.input_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.69.mlp.down_proj.weight": "model-00015-of-00018.safetensors", "model.layers.69.mlp.gate_proj.weight": "model-00015-of-00018.safetensors", "model.layers.69.mlp.up_proj.weight": "model-00015-of-00018.safetensors", "model.layers.69.post_attention_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.69.self_attn.k_proj.weight": "model-00015-of-00018.safetensors", "model.layers.69.self_attn.o_proj.weight": "model-00015-of-00018.safetensors", "model.layers.69.self_attn.q_proj.weight": "model-00015-of-00018.safetensors", "model.layers.69.self_attn.v_proj.weight": "model-00015-of-00018.safetensors", "model.layers.7.input_layernorm.weight": "model-00015-of-00018.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00016-of-00018.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00016-of-00018.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00016-of-00018.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00016-of-00018.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00016-of-00018.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00016-of-00018.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00016-of-00018.safetensors", "model.layers.70.input_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.70.mlp.down_proj.weight": "model-00016-of-00018.safetensors", "model.layers.70.mlp.gate_proj.weight": "model-00016-of-00018.safetensors", "model.layers.70.mlp.up_proj.weight": "model-00016-of-00018.safetensors", "model.layers.70.post_attention_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.70.self_attn.k_proj.weight": "model-00016-of-00018.safetensors", "model.layers.70.self_attn.o_proj.weight": "model-00016-of-00018.safetensors", "model.layers.70.self_attn.q_proj.weight": "model-00016-of-00018.safetensors", "model.layers.70.self_attn.v_proj.weight": "model-00016-of-00018.safetensors", "model.layers.71.input_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.71.mlp.down_proj.weight": "model-00016-of-00018.safetensors", "model.layers.71.mlp.gate_proj.weight": "model-00016-of-00018.safetensors", "model.layers.71.mlp.up_proj.weight": "model-00016-of-00018.safetensors", "model.layers.71.post_attention_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.71.self_attn.k_proj.weight": "model-00016-of-00018.safetensors", "model.layers.71.self_attn.o_proj.weight": 
"model-00016-of-00018.safetensors", "model.layers.71.self_attn.q_proj.weight": "model-00016-of-00018.safetensors", "model.layers.71.self_attn.v_proj.weight": "model-00016-of-00018.safetensors", "model.layers.72.input_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.72.mlp.down_proj.weight": "model-00016-of-00018.safetensors", "model.layers.72.mlp.gate_proj.weight": "model-00016-of-00018.safetensors", "model.layers.72.mlp.up_proj.weight": "model-00016-of-00018.safetensors", "model.layers.72.post_attention_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.72.self_attn.k_proj.weight": "model-00016-of-00018.safetensors", "model.layers.72.self_attn.o_proj.weight": "model-00016-of-00018.safetensors", "model.layers.72.self_attn.q_proj.weight": "model-00016-of-00018.safetensors", "model.layers.72.self_attn.v_proj.weight": "model-00016-of-00018.safetensors", "model.layers.73.input_layernorm.weight": "model-00016-of-00018.safetensors", "model.layers.73.mlp.down_proj.weight": "model-00016-of-00018.safetensors", "model.layers.73.mlp.gate_proj.weight": "model-00016-of-00018.safetensors", "model.layers.73.mlp.up_proj.weight": "model-00017-of-00018.safetensors", "model.layers.73.post_attention_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.73.self_attn.k_proj.weight": "model-00017-of-00018.safetensors", "model.layers.73.self_attn.o_proj.weight": "model-00017-of-00018.safetensors", "model.layers.73.self_attn.q_proj.weight": "model-00017-of-00018.safetensors", "model.layers.73.self_attn.v_proj.weight": "model-00017-of-00018.safetensors", "model.layers.74.input_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.74.mlp.down_proj.weight": "model-00017-of-00018.safetensors", "model.layers.74.mlp.gate_proj.weight": "model-00017-of-00018.safetensors", "model.layers.74.mlp.up_proj.weight": "model-00017-of-00018.safetensors", "model.layers.74.post_attention_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.74.self_attn.k_proj.weight": "model-00017-of-00018.safetensors", "model.layers.74.self_attn.o_proj.weight": "model-00017-of-00018.safetensors", "model.layers.74.self_attn.q_proj.weight": "model-00017-of-00018.safetensors", "model.layers.74.self_attn.v_proj.weight": "model-00017-of-00018.safetensors", "model.layers.75.input_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.75.mlp.down_proj.weight": "model-00017-of-00018.safetensors", "model.layers.75.mlp.gate_proj.weight": "model-00017-of-00018.safetensors", "model.layers.75.mlp.up_proj.weight": "model-00017-of-00018.safetensors", "model.layers.75.post_attention_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.75.self_attn.k_proj.weight": "model-00017-of-00018.safetensors", "model.layers.75.self_attn.o_proj.weight": "model-00017-of-00018.safetensors", "model.layers.75.self_attn.q_proj.weight": "model-00017-of-00018.safetensors", "model.layers.75.self_attn.v_proj.weight": "model-00017-of-00018.safetensors", "model.layers.76.input_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.76.mlp.down_proj.weight": "model-00017-of-00018.safetensors", "model.layers.76.mlp.gate_proj.weight": "model-00017-of-00018.safetensors", "model.layers.76.mlp.up_proj.weight": "model-00017-of-00018.safetensors", "model.layers.76.post_attention_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.76.self_attn.k_proj.weight": "model-00017-of-00018.safetensors", "model.layers.76.self_attn.o_proj.weight": 
"model-00017-of-00018.safetensors", "model.layers.76.self_attn.q_proj.weight": "model-00017-of-00018.safetensors", "model.layers.76.self_attn.v_proj.weight": "model-00017-of-00018.safetensors", "model.layers.77.input_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.77.mlp.down_proj.weight": "model-00017-of-00018.safetensors", "model.layers.77.mlp.gate_proj.weight": "model-00017-of-00018.safetensors", "model.layers.77.mlp.up_proj.weight": "model-00017-of-00018.safetensors", "model.layers.77.post_attention_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.77.self_attn.k_proj.weight": "model-00017-of-00018.safetensors", "model.layers.77.self_attn.o_proj.weight": "model-00017-of-00018.safetensors", "model.layers.77.self_attn.q_proj.weight": "model-00017-of-00018.safetensors", "model.layers.77.self_attn.v_proj.weight": "model-00017-of-00018.safetensors", "model.layers.78.input_layernorm.weight": "model-00017-of-00018.safetensors", "model.layers.78.mlp.down_proj.weight": "model-00018-of-00018.safetensors", "model.layers.78.mlp.gate_proj.weight": "model-00018-of-00018.safetensors", "model.layers.78.mlp.up_proj.weight": "model-00018-of-00018.safetensors", "model.layers.78.post_attention_layernorm.weight": "model-00018-of-00018.safetensors", "model.layers.78.self_attn.k_proj.weight": "model-00018-of-00018.safetensors", "model.layers.78.self_attn.o_proj.weight": "model-00018-of-00018.safetensors", "model.layers.78.self_attn.q_proj.weight": "model-00018-of-00018.safetensors", "model.layers.78.self_attn.v_proj.weight": "model-00018-of-00018.safetensors", "model.layers.79.input_layernorm.weight": "model-00018-of-00018.safetensors", "model.layers.79.mlp.down_proj.weight": "model-00018-of-00018.safetensors", "model.layers.79.mlp.gate_proj.weight": "model-00018-of-00018.safetensors", "model.layers.79.mlp.up_proj.weight": "model-00018-of-00018.safetensors", "model.layers.79.post_attention_layernorm.weight": "model-00018-of-00018.safetensors", "model.layers.79.self_attn.k_proj.weight": "model-00018-of-00018.safetensors", "model.layers.79.self_attn.o_proj.weight": "model-00018-of-00018.safetensors", "model.layers.79.self_attn.q_proj.weight": "model-00018-of-00018.safetensors", "model.layers.79.self_attn.v_proj.weight": "model-00018-of-00018.safetensors", "model.layers.8.input_layernorm.weight": "model-00018-of-00018.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00018-of-00018.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00018-of-00018.safetensors", "model.layers.8.mlp.up_proj.weight": "model-00018-of-00018.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00018-of-00018.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00018-of-00018.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00018-of-00018.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00018-of-00018.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00018-of-00018.safetensors", "model.layers.9.input_layernorm.weight": "model-00018-of-00018.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00018-of-00018.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00018-of-00018.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00018-of-00018.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00018-of-00018.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00018-of-00018.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00018-of-00018.safetensors", 
"model.layers.9.self_attn.q_proj.weight": "model-00018-of-00018.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00018-of-00018.safetensors", "model.norm.weight": "model-00018-of-00018.safetensors"}}
output-00001-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4ba9d5e6046df41049bac89af63bb788c7ee1e153bd6c133b2f4325e8719f236
+ size 8458804448
output-00002-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:849a925289f9156c33586969fe917e8e990956571466fe131dfd87b1814873e4
+ size 8465764184
output-00003-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:39e1ebabba3faaa3d0a62f410afe20c6e9f513536ad77df077ffd29272e53892
+ size 8422613728
output-00004-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0114308e6d45bb4ca7d9c7f021cd5186806b1a0b8ea4f369d0331d13c3d21535
+ size 8511305544
output-00005-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9d8655055e2c3f4ca264c7ff5493f6696d66062ae43408b496b6954f342a3b2b
+ size 8499354608
output-00006-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ce23de89eefed7e6d052d834e6be5395268afb3f80a83790b370b0d300b42872
+ size 5421719734
special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
+ size 499723
tokenizer_config.json ADDED
@@ -0,0 +1,42 @@
+ {
+ "add_bos_token": true,
+ "add_eos_token": false,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "bos_token": "<s>",
+ "chat_template": "{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token}}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}",
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "</s>",
+ "legacy": false,
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "<unk>",
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "LlamaTokenizer",
+ "unk_token": "<unk>",
+ "use_default_system_prompt": false
+ }
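The `chat_template` entry above encodes the Mistral-style `[INST] … [/INST]` prompt format and raises if user/assistant turns do not strictly alternate. A minimal sketch of it in action, again assuming a local checkout of this repo:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(".")
messages = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there."},
    {"role": "user", "content": "How are you?"},
]

# tokenize=False returns the rendered prompt string instead of token ids.
prompt = tok.apply_chat_template(messages, tokenize=False)
print(prompt)
# -> <s>[INST] Hello! [/INST]Hi there.</s>[INST] How are you? [/INST]
```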