Commit 5973ddf (parent: a28115e): Create README.md

README.md (ADDED, +35 lines)
---
base_model:
- TouchNight/Ministral-8B-Instruct-2410-HF
---

It is worth noting that, compared with the prince-canuma version, this version is smaller after quantization, and its accuracy is roughly one percentage point higher.
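The card itself does not include usage code, so the following is only a minimal sketch of loading a local copy of the model with vLLM, mirroring the tensor-parallel, context-length, and dtype settings from the evaluation logs below. The model path is a placeholder taken from those logs, not a path published with this card.

```python
# Minimal sketch (not part of the original card): load a local copy of the
# model with vLLM, using the same settings as the evaluation runs below.
# "/root/autodl-tmp/output" is a placeholder path for the quantized weights.
from vllm import LLM, SamplingParams

llm = LLM(
    model="/root/autodl-tmp/output",  # placeholder local path
    tensor_parallel_size=2,
    max_model_len=2048,
    dtype="float16",
)

params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(["Compute 15 * 17 and explain the steps."], params)
print(outputs[0].outputs[0].text)
```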

vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version| Filter |n-shot| Metric | |Value| |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match|↑ |0.820|± |0.0243|
| | |strict-match | 5|exact_match|↑ |0.816|± |0.0246|
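The blocks above and below are raw lm-evaluation-harness output. A hedged sketch of reproducing the first run through the harness's Python API is shown here; it assumes the lm_eval and vllm packages are installed and that the model has been downloaded to the same local path as in the logs.

```python
# Sketch of reproducing the float16 run above with lm-evaluation-harness's
# Python API (assumes the lm_eval and vllm packages are installed).
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args=(
        "pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,"
        "add_bos_token=true,tensor_parallel_size=2,"
        "max_model_len=2048,dtype=float16"
    ),
    tasks=["gsm8k"],
    num_fewshot=5,
    limit=250,
    batch_size="auto",
)
print(results["results"]["gsm8k"])
```

The same call with dtype=bfloat16 or dtype=float32, or with pretrained=/root/autodl-tmp/output for the quantized model, corresponds to the remaining runs below.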

vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version| Filter |n-shot| Metric | |Value| |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match|↑ |0.804|± |0.0252|
| | |strict-match | 5|exact_match|↑ |0.804|± |0.0252|

vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float32), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version| Filter |n-shot| Metric | |Value| |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match|↑ |0.820|± |0.0243|
| | |strict-match | 5|exact_match|↑ |0.816|± |0.0246|

vllm (pretrained=/root/autodl-tmp/output,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version| Filter |n-shot| Metric | |Value| |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match|↑ |0.816|± |0.0246|
| | |strict-match | 5|exact_match|↑ |0.812|± |0.0248|

vllm (pretrained=/root/autodl-tmp/output,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto

|Tasks|Version| Filter |n-shot| Metric | |Value| |Stderr|
|-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
|gsm8k| 3|flexible-extract| 5|exact_match|↑ |0.796|± |0.0255|
| | |strict-match | 5|exact_match|↑ |0.792|± |0.0257|