Commit cf851bb by aashish1904 (parent: 15f85d6): Upload README.md with huggingface_hub

Files changed (1): README.md (+86)
---
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
library_name: peft
tags:
- mergekit
- merge
- llama-factory
- lora
datasets:
- allura-org/fujin-cleaned-stage-1
- Dampfinchen/Creative_Writing_Multiturn
- ToastyPigeon/SpringDragon
- allura-org/medquad_sharegpt
- allura-org/scienceqa_sharegpt
- Alignment-Lab-AI/orcamath-sharegpt
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Q25-1.5B-VeoLu-GGUF
This is a quantized version of [Alfitaria/Q25-1.5B-VeoLu](https://huggingface.co/Alfitaria/Q25-1.5B-VeoLu) created using llama.cpp.

# Original Model Card

# Q25-1.5-VeoLu-R2
![made with StableNoobAI-IterSPO in sd-webui-forge](veolu.png)
Q25-1.5B-Veo Lu is a tiny General-Purpose Creative model, made up of a merge of bespoke finetunes on Qwen 2.5-1.5B-Instruct.

Inspired by the success of [MN-12B-Mag Mell](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1) and [MS-Meadowlark-22B](https://huggingface.co/allura-org/MS-Meadowlark-22B), Veo Lu was trained on a healthy, balanced diet of Internet fiction, roleplaying, adventuring, and reasoning/general knowledge.

The components of Veo Lu are:

* Bard (pretrain, writing): [Fujin (Cleaned/extended Rosier)](https://huggingface.co/allura-org/fujin-cleaned-stage-1)
* Scribe (pretrain, roleplay): [Creative Writing Multiturn](https://huggingface.co/Dampfinchen/Creative_Writing_Multiturn)
* Cartographer (pretrain, adventuring): [SpringDragon](https://huggingface.co/ToastyPigeon/SpringDragon)
* Alchemist (SFT, science/reasoning): [ScienceQA](https://huggingface.co/allura-org/scienceqa_sharegpt), [MedquadQA](https://huggingface.co/allura-org/medquad_sharegpt), [Orca Math Word Problems](https://huggingface.co/Alignment-Lab-AI/orcamath-sharegpt)

This model is capable of carrying on a scene without going completely off the rails. That being said, it only has 1.5B parameters. So please, for the love of God, *manage your expectations.*
Since it's Qwen, use ChatML formatting. Turn the temperature down to ~0.7-0.8 and try a dash of rep-pen.
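For reference, a minimal sketch of the ChatML layout Qwen-family models expect; the system and user strings here are just placeholders:

```python
def chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt as used by Qwen-family models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    "You are a creative writing assistant.",
    "Open a scene in a rainy harbor town.",
)
```

Most frontends (and vLLM's chat endpoint) apply this template for you from the model's tokenizer config, so you only need to build it by hand for raw completion-style calls.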

GGUFs coming soon, but honestly, the full-precision model is 3.5GB in size. You might wanna have a go at running this unquantized with vLLM.
```sh
pip install vllm
vllm serve Alfitaria/Q25-1.5B-VeoLu --max-model-len 16384 --max-num-seqs 1
```
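Once the server is up, vLLM exposes an OpenAI-compatible chat endpoint. A sketch of a request against it, assuming vLLM's default port (8000) and the standard chat-completions payload shape:

```python
import json
from urllib.request import Request, urlopen

def build_request(user_msg: str) -> dict:
    # Chat-completions payload; the server applies the ChatML template itself.
    return {
        "model": "Alfitaria/Q25-1.5B-VeoLu",
        "messages": [{"role": "user", "content": user_msg}],
        "temperature": 0.7,  # per the sampling advice above
        "max_tokens": 256,
    }

def query(payload: dict,
          url: str = "http://localhost:8000/v1/chat/completions") -> str:
    req = Request(url, data=json.dumps(payload).encode(),
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

payload = build_request("Write two lines about a lighthouse.")
# reply = query(payload)  # uncomment once `vllm serve` is running
```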

Made by inflatebot.

Special thanks to our friends at [Allura](https://huggingface.co/allura-org), and especially to [Auri](https://huggingface.co/AuriAetherwiing), who basically held my hand through the whole process. Her effort and enthusiasm carried this project forward.

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: Qwen/Qwen2.5-1.5B-Instruct
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 28]
    model: /home/asriel/AI/text/models/bard
    parameters:
      weight: 1.0
  - layer_range: [0, 28]
    model: /home/asriel/AI/text/models/scribe
    parameters:
      weight: 1.0
  - layer_range: [0, 28]
    model: /home/asriel/AI/text/models/cartographer
    parameters:
      weight: 1.0
  - layer_range: [0, 28]
    model: /home/asriel/AI/text/models/alchemist
    parameters:
      weight: 1.0
  - layer_range: [0, 28]
    model: Qwen/Qwen2.5-1.5B-Instruct
```
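Conceptually, `task_arithmetic` adds each finetune's weighted delta from the base model back onto the base, with `normalize` rescaling the weights to sum to 1. A toy sketch with plain Python floats standing in for tensors (a simplified illustration, not mergekit's actual implementation):

```python
def task_arithmetic(base, finetunes, weights, normalize=True):
    """Toy per-parameter task-arithmetic merge: base + sum of weighted deltas."""
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = []
    for i, b in enumerate(base):
        delta = sum(w * (ft[i] - b) for ft, w in zip(finetunes, weights))
        merged.append(b + delta)
    return merged

# Four equal-weight "finetunes" of a two-parameter "model":
base = [1.0, 2.0]
fts = [[1.4, 2.0], [1.0, 2.4], [1.2, 2.2], [1.0, 2.0]]
merged = task_arithmetic(base, fts, [1.0, 1.0, 1.0, 1.0])
```

With equal weights, each finetune contributes a quarter of its delta, so the merge averages the four specialists' changes rather than stacking them at full strength.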