---
tags:
- manticore
- guanaco
- uncensored
---

---
# 4bit GGML of:
Manticore-13b-Chat-Pyg by [openaccess-ai-collective](https://huggingface.co/openaccess-ai-collective/manticore-13b-chat-pyg) with the Guanaco 13b QLoRA by [TimDettmers](https://huggingface.co/timdettmers/guanaco-13b) applied through [Monero](https://huggingface.co/Monero/Manticore-13b-Chat-Pyg-Guanaco), quantized by [mindrage](https://huggingface.co/mindrage), uncensored

[link to GPTQ Version](https://huggingface.co/mindrage/Manticore-13B-Chat-Pyg-Guanaco-GPTQ-4bit-128g.no-act-order.safetensors)

---
Quantized to 4bit GGML (q4_0) using the newest llama.cpp, and it will therefore only work with llama.cpp versions compiled after May 19th, 2023.

The model seems to have noticeably benefited from further augmentation with the Guanaco QLoRA.
Its capabilities seem broad, even compared with other Wizard or Manticore models, with the expected weaknesses in coding. It is very good at in-context learning and, for its class, at reasoning.
It both follows instructions well and can be used as a chatbot.
Refreshingly, it does not seem to insist on aggressively sticking to narratives to justify previously hallucinated output as much as similar models do. Its output seems... eerily smart at times.
I believe the model is fully unrestricted/uncensored and will generally not berate the user.
---

Prompting style + settings:
---
Presumably due to the very diverse training data, the model accepts a variety of prompting styles with relatively few issues, including the ###-variant, but it seems to work best using:
# "Naming" the model works great by simply modifying the context. Substantial changes in its behaviour can be caused very simply by appending to "ASSISTANT:", e.g. "ASSISTANT: After careful consideration, thinking step-by-step, my response is:"

- user: "USER:"
- bot: "ASSISTANT:"
- context: "This is a conversation between an advanced AI and a human user."

Turn Template: <|user|> <|user-message|>\n<|bot|><|bot-message|>\n
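As a sketch, the context line and turn template above can be assembled into a full prompt like this (the helper function and example messages are illustrative, not part of the model card):

```python
# Build a prompt following the card's turn template:
#   <|user|> <|user-message|>\n<|bot|><|bot-message|>\n
# Note: a space after the user tag, none after the bot tag.
CONTEXT = "This is a conversation between an advanced AI and a human user."

def build_prompt(turns, user_tag="USER:", bot_tag="ASSISTANT:"):
    """turns: list of (user_message, bot_reply_or_None) pairs.
    A None reply leaves the assistant slot open for the model to complete."""
    prompt = CONTEXT + "\n"
    for user_msg, bot_msg in turns:
        prompt += f"{user_tag} {user_msg}\n{bot_tag}"
        if bot_msg is not None:
            prompt += bot_msg + "\n"
    return prompt

print(build_prompt([("Hello, who are you?", None)]))
```

The final open "ASSISTANT:" is what the model continues from; earlier completed turns are appended verbatim per the template.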
Settings that work well without (subjectively) being too deterministic:
|
41 |
+
|
42 |
+
temp: 0.15 -
|
43 |
+
top_p: 0.1 -
|
44 |
+
top_k: 40 -
|
45 |
+
rep penalty: 1.1
|
46 |
+
---
|
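For a 2023-era llama.cpp build, the settings above map onto the `main` example's CLI flags roughly as follows (the model filename is a placeholder for whichever GGML file you downloaded, and the prompt follows the template described earlier):

```shell
# Sketch only: substitute the actual GGML filename from this repo.
./main -m ./manticore-13b-chat-pyg-guanaco.q4_0.bin \
  --temp 0.15 --top_p 0.1 --top_k 40 --repeat_penalty 1.1 \
  -p 'This is a conversation between an advanced AI and a human user.
USER: Hello, who are you?
ASSISTANT:'
```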