This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

# README

This is a test model, fine-tuned on the following:
- a private dataset
- a slightly customized Alpaca chat template
- works with `ollama create`, but requires a customized Modelfile
- one reason for this was that I wanted to try a Q2_K quantization and see if it was actually any good -> it exceeds expectations!!
- my examples are based on the unsloth.Q2_K.gguf file, though other quantizations should work as well

# HOW TO USE

The whole point of the conversion, for me, was to be able to use the model through Ollama (or other local options).
Ollama requires a GGUF file; once you have one, the rest is pretty straightforward.

If you want to try it first, the Q2_K version of this model is available on Ollama => deeokay/minimistral

```bash
ollama pull deeokay/minimistral
```

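Once pulled, you can chat with it right away (standard Ollama usage; the model name matches the tag above):

```bash
ollama run deeokay/minimistral
```
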
# Quick Start:

- You must already have Ollama running on your system
- Download the unsloth.Q2_K.gguf model from Files
- In the same directory, create a file called "Modelfile"
- Inside the "Modelfile", add the following:

```
# Point FROM at the GGUF file you downloaded (unsloth.Q2_K.gguf from Files)
FROM ./unsloth.Q2_K.gguf

PARAMETER stop <|STOP|>
PARAMETER stop "<|STOP|>"
PARAMETER stop <|END_RESPONSE|>
PARAMETER stop "<|END_RESPONSE|>"
PARAMETER temperature 0.4

TEMPLATE """<|BEGIN_QUERY|>
{{.Prompt}}
<|END_QUERY|>
<|BEGIN_RESPONSE|>
"""

SYSTEM """You are an AI assistant. Respond to the user's query between the BEGIN_QUERY and END_QUERY tokens. Use the appropriate BEGIN_ and END_ tokens for different types of content in your response."""
```
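
For reference, with this TEMPLATE a prompt such as "Hello" should expand into roughly the following before it reaches the model (a sketch of the `{{.Prompt}}` substitution, not verbatim Ollama output); generation then halts when the model emits <|END_RESPONSE|> or <|STOP|>, per the PARAMETER stop lines:

```
<|BEGIN_QUERY|>
Hello
<|END_QUERY|>
<|BEGIN_RESPONSE|>
```
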
- Save it, and go back to the folder (the folder where the model + Modelfile exist)
- Now, in a terminal, make sure you are in that same folder and type the following command:

```bash
ollama create mycustomai -f Modelfile   # "mycustomai" <- you can name it anything you want
```

After that, you should be able to use this model to chat!
This GGUF is based on unsloth/mistral-7b-instruct-v0.3-bnb-4bit by Unsloth.

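For example (using whatever name you gave to `ollama create` above):

```bash
ollama run mycustomai
```
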
# NOTE: DISCLAIMER

Please note this is not intended for production use; it is the result of fine-tuning as a self-learning exercise.
This is my fine-tuning pass with a personalized, customized dataset.
Please feel free to customize the Modelfile, and if you do get a better response than mine, please share!!

If you would like to know how I started creating my dataset, you can check this link:
[Crafting GPT2 for Personalized AI-Preparing Data the Long Way (Part1)](https://medium.com/@deeokay/the-soul-in-the-machine-crafting-gpt2-for-personalized-ai-9d38be3f635f)

As the data was created with custom GPT2 special tokens, I had to convert it to the Alpaca template.
However, I got creative again.. the training data uses the following special tokens:

```python
special_tokens_dict = {
    'eos_token': '<|STOP|>',
    'bos_token': '<|STOP|>',
    'pad_token': '<|PAD|>',
    'additional_special_tokens': ['<|BEGIN_QUERY|>', '<|END_QUERY|>',
                                  '<|BEGIN_ANALYSIS|>', '<|END_ANALYSIS|>',
                                  '<|BEGIN_RESPONSE|>', '<|END_RESPONSE|>',
                                  '<|BEGIN_SENTIMENT|>', '<|END_SENTIMENT|>',
                                  '<|BEGIN_CLASSIFICATION|>', '<|END_CLASSIFICATION|>']
}

tokenizer.add_special_tokens(special_tokens_dict)
model.resize_token_embeddings(len(tokenizer))

tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids('<|STOP|>')
tokenizer.bos_token_id = tokenizer.convert_tokens_to_ids('<|STOP|>')
tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids('<|PAD|>')
```
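
As a quick sanity check (a sketch that assumes the `tokenizer` from the snippet above), you can confirm the markers were registered as single tokens rather than being split into sub-word pieces:

```python
# Each custom marker should now map to a single dedicated token id.
print(tokenizer.convert_tokens_to_ids('<|BEGIN_QUERY|>'))
print(tokenizer.tokenize('<|BEGIN_QUERY|>Hello<|END_QUERY|>'))
# expected: ['<|BEGIN_QUERY|>', ...pieces for 'Hello'..., '<|END_QUERY|>']
```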

The data is in the following format:

```python
def combine_text(user_prompt, analysis, sentiment, new_response, classification):
    # Wrap each field in its matching BEGIN_/END_ marker pair;
    # <|STOP|> acts as both BOS and EOS around the full sample.
    user_q = f"<|STOP|><|BEGIN_QUERY|>{user_prompt}<|END_QUERY|>"
    analysis = f"<|BEGIN_ANALYSIS|>{analysis}<|END_ANALYSIS|>"
    new_response = f"<|BEGIN_RESPONSE|>{new_response}<|END_RESPONSE|>"
    classification = f"<|BEGIN_CLASSIFICATION|>{classification}<|END_CLASSIFICATION|>"
    sentiment = f"<|BEGIN_SENTIMENT|>Sentiment: {sentiment}<|END_SENTIMENT|><|STOP|>"
    return user_q + analysis + new_response + classification + sentiment
```
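
For illustration, here is a minimal sketch of what one combined training sample looks like (the field values are made up for this example):

```python
sample = combine_text(
    user_prompt="What is a GGUF file?",
    analysis="The user is asking a factual question about model file formats.",
    sentiment="Neutral",
    new_response="GGUF is a binary format for packaging models for llama.cpp-based runtimes such as Ollama.",
    classification="question_answering",
)
print(sample)
# <|STOP|><|BEGIN_QUERY|>What is a GGUF file?<|END_QUERY|><|BEGIN_ANALYSIS|>...
```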