README.md (CHANGED)
````diff
@@ -5,7 +5,17 @@ license: apache-2.0
 model_creator: Intel
 model_name: Neural Chat 7B v3-1
 model_type: mistral
-prompt_template: '{prompt}
+prompt_template: '### System:
+
+  {system_message}
+
+
+  ### User:
+
+  {prompt}
+
+
+  ### Assistant:
 
   '
 quantized_by: TheBloke
````
````diff
@@ -69,11 +79,17 @@ Here is an incomplete list of clients and libraries that are known to support GG
 <!-- repositories-available end -->
 
 <!-- prompt-template start -->
-## Prompt template:
+## Prompt template: Orca-Hashes
 
 ```
+### System:
+{system_message}
+
+### User:
 {prompt}
 
+### Assistant:
+
 ```
 
 <!-- prompt-template end -->
````
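The template change above is the substance of this commit: the quant originally shipped with a bare `{prompt}` template and now documents the Orca-Hashes format. A minimal sketch of filling the template in Python; the helper name and example strings are illustrative, not from either version of the README:

```python
# Minimal sketch: fill the Orca-Hashes template for Neural Chat 7B v3-1.
# The helper name and example strings are illustrative, not from the README.
PROMPT_TEMPLATE = (
    "### System:\n{system_message}\n\n"
    "### User:\n{prompt}\n\n"
    "### Assistant:\n"
)

def format_prompt(system_message: str, prompt: str) -> str:
    """Return the full text to feed the model; the reply is generated
    as the continuation after the trailing '### Assistant:' header."""
    return PROMPT_TEMPLATE.format(system_message=system_message, prompt=prompt)

print(format_prompt("You are a helpful assistant.",
                    "Summarize what GGUF is in one sentence."))
```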
````diff
@@ -191,7 +207,7 @@ Windows Command Line users: You can set the environment variable by running `set
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 32 -m neural-chat-7b-v3-1.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "{prompt}"
+./main -ngl 32 -m neural-chat-7b-v3-1.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### System:\n{system_message}\n\n### User:\n{prompt}\n\n### Assistant:"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
````
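The same fix applies to the example `llama.cpp` command: the `-p` string now carries the full template rather than a bare `{prompt}`. For the equivalent from Python, a sketch against the llama-cpp-python bindings; the choice of bindings, the model path, and the stop sequence are assumptions for illustration:

```python
# Sketch: run the quantized model via llama-cpp-python instead of ./main.
# Mirrors the CLI flags above: -ngl 32 -> n_gpu_layers, -c 2048 -> n_ctx.
from llama_cpp import Llama

llm = Llama(
    model_path="neural-chat-7b-v3-1.Q4_K_M.gguf",  # illustrative local path
    n_ctx=2048,
    n_gpu_layers=32,  # set to 0 if you have no GPU acceleration
)

prompt = (
    "### System:\nYou are a helpful assistant.\n\n"
    "### User:\nWhat is DPO in one sentence?\n\n"
    "### Assistant:"
)

out = llm(
    prompt,
    max_tokens=256,
    temperature=0.7,
    repeat_penalty=1.1,
    stop=["### User:"],  # stop before the template would start a new turn
)
print(out["choices"][0]["text"])
```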
````diff
@@ -285,9 +301,9 @@ And thank you again to a16z for their generous grant.
 # Original model card: Intel's Neural Chat 7B v3-1
 
 
-##
+## Fine-tuning on [Habana](https://habana.ai/) Gaudi2
 
-This model is a fine-tuned model based on [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the open source dataset [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca). Then we align it with DPO algorithm. For more details, you can refer our blog: [
+This model was fine-tuned from [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the open-source dataset [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca), then aligned with the DPO algorithm. For more details, see our blog post: [The Practice of Supervised Fine-tuning and Direct Preference Optimization on Habana Gaudi2](https://medium.com/@NeuralCompressor/the-practice-of-supervised-finetuning-and-direct-preference-optimization-on-habana-gaudi2-a1197d8a3cd3).
 
 ## Model date
 Neural-chat-7b-v3-1 was trained between September and October, 2023.
````
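The restored card text describes the recipe only at a high level: supervised fine-tuning on SlimOrca, then DPO alignment. As an outline of the DPO step, a sketch with TRL's `DPOTrainer`, assuming a late-2023 TRL API; the inline preference dataset, `beta`, and training arguments are placeholders, and Intel's actual run used Habana Gaudi2 hardware rather than this setup:

```python
# Hypothetical outline of the DPO step described in the card (TRL ~0.7 API).
# The tiny inline dataset and all hyperparameters are placeholders.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Mistral ships without a pad token

model = AutoModelForCausalLM.from_pretrained(base)      # policy to optimize
ref_model = AutoModelForCausalLM.from_pretrained(base)  # frozen reference

pref_dataset = Dataset.from_dict({
    "prompt":   ["### System:\nBe helpful.\n### User:\nWhat is DPO?\n### Assistant:\n"],
    "chosen":   ["Direct Preference Optimization, an RL-free alignment method."],
    "rejected": ["No idea."],
})

trainer = DPOTrainer(
    model,
    ref_model,
    beta=0.1,  # common default, not stated in the card
    args=TrainingArguments(output_dir="dpo-out",
                           per_device_train_batch_size=1,
                           remove_unused_columns=False),
    train_dataset=pref_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```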
````diff
@@ -317,10 +333,22 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 64
 - total_eval_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type:
-- lr_scheduler_warmup_ratio: 0.
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.03
 - num_epochs: 2.0
 
+## Prompt Template
+
+```
+### System:
+{system}
+### User:
+{usr}
+### Assistant:
+
+```
+
+
 ## Inference with transformers
 
 ```shell
````
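The hunk cuts off at the opening of the card's `## Inference with transformers` block, so the actual snippet is not visible in this diff. A minimal sketch of what such inference looks like with the template above; the generation settings are illustrative, not Intel's published example:

```python
# Sketch: basic transformers inference with the Orca-Hashes template.
# Generation parameters here are illustrative, not from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Intel/neural-chat-7b-v3-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = (
    "### System:\nYou are a helpful assistant.\n"
    "### User:\nName one use of Intel Neural Compressor.\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```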
````diff
@@ -346,6 +374,5 @@ The NeuralChat team with members from Intel/SATG/AIA/AIPT. Core team members: Ka
 ## Useful links
 * Intel Neural Compressor [link](https://github.com/intel/neural-compressor)
 * Intel Extension for Transformers [link](https://github.com/intel/intel-extension-for-transformers)
-* Intel Extension for PyTorch [link](https://github.com/intel/intel-extension-for-pytorch)
 
 <!-- original-model-card end -->
````