---
tags:
- manticore
- guanaco
- uncensored
---

---
# 4bit GGML of:
Manticore-13b-Chat-Pyg by [openaccess-ai-collective](https://huggingface.co/openaccess-ai-collective/manticore-13b-chat-pyg) with the Guanaco 13b QLoRA by [TimDettmers](https://huggingface.co/timdettmers/guanaco-13b) applied through [Monero](https://huggingface.co/Monero/Manticore-13b-Chat-Pyg-Guanaco), quantized by [mindrage](https://huggingface.co/mindrage), uncensored

[link to GPTQ Version](https://huggingface.co/mindrage/Manticore-13B-Chat-Pyg-Guanaco-GPTQ-4bit-128g.no-act-order.safetensors)

---
Quantized to 4bit GGML (q4_0) using the newest llama.cpp, and it will therefore only work with llama.cpp versions compiled after May 19th, 2023.

The model seems to have noticeably benefited from further augmentation with the Guanaco QLoRA.
Its capabilities seem broad, even compared with other Wizard or Manticore models, with the expected weaknesses in coding. It is very good at in-context learning and, for its class, at reasoning.
It both follows instructions well and can be used as a chatbot.
Refreshingly, it does not seem to insist on aggressively sticking to narratives to justify previously hallucinated output as much as similar models do. Its output seems... eerily smart at times.
I believe the model is fully unrestricted/uncensored and will generally not berate the user.
---

Prompting style + settings:
---
Presumably due to the very diverse training data, the model accepts a variety of prompting styles with relatively few issues, including the ###-variant, but it seems to work best using:
# "Naming" the model works great by simply modifying the context. Substantial changes in its behaviour can be caused very simply by appending to "ASSISTANT:", e.g. "ASSISTANT: After careful consideration, thinking step-by-step, my response is:"

- user: "USER:"
- bot: "ASSISTANT:"
- context: "This is a conversation between an advanced AI and a human user."

Turn Template: <|user|> <|user-message|>\n<|bot|><|bot-message|>\n
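As a sketch, the context line and turn template above can be assembled into a full prompt like this (the helper function and example messages are illustrative, not part of the model card):

```python
# Build a prompt following the card's turn template:
#   <|user|> <|user-message|>\n<|bot|><|bot-message|>\n
# Note: a space after the user tag, none after the bot tag.
CONTEXT = "This is a conversation between an advanced AI and a human user."

def build_prompt(turns, user_tag="USER:", bot_tag="ASSISTANT:"):
    """turns: list of (user_message, bot_reply_or_None) pairs.
    A None reply leaves the assistant slot open for the model to complete."""
    prompt = CONTEXT + "\n"
    for user_msg, bot_msg in turns:
        prompt += f"{user_tag} {user_msg}\n{bot_tag}"
        if bot_msg is not None:
            prompt += bot_msg + "\n"
    return prompt

print(build_prompt([("Hello, who are you?", None)]))
```

The final open "ASSISTANT:" is what the model continues from; earlier completed turns are appended verbatim per the template.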
Settings that work well without (subjectively) being too deterministic:
|
41 |
+
|
42 |
+
temp: 0.15 -
|
43 |
+
top_p: 0.1 -
|
44 |
+
top_k: 40 -
|
45 |
+
rep penalty: 1.1
|
46 |
+
---
|
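For a 2023-era llama.cpp build, the settings above map onto the `main` example's CLI flags roughly as follows (the model filename is a placeholder for whichever GGML file you downloaded, and the prompt follows the template described earlier):

```shell
# Sketch only: substitute the actual GGML filename from this repo.
./main -m ./manticore-13b-chat-pyg-guanaco.q4_0.bin \
  --temp 0.15 --top_p 0.1 --top_k 40 --repeat_penalty 1.1 \
  -p 'This is a conversation between an advanced AI and a human user.
USER: Hello, who are you?
ASSISTANT:'
```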