second-state/SmolLM-360M-Instruct-GGUF


apepkuss79 committed on Jul 29

Commit 5d70b60 (1 parent: 103459c)

Update README.md

Files changed (1)

README.md +13 -19


@@ -25,36 +25,30 @@ language:
 ## Run with LlamaEdge
-- LlamaEdge version: coming soon
-<!-- - LlamaEdge version: [v0.12.4](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.12.4) and above
 - Prompt template
-  - Prompt type: `llama-3-chat`
   - Prompt string
     ```text
-    <|begin_of_text|><|start_header_id|>system<|end_header_id|>
-    {{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>
-    {{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-    {{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>
-    {{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-    ``` -->
 - Context size: `2048`
-<!-- - Run as LlamaEdge service
   ```bash
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM-360M-Instruct-Q5_K_M.gguf \
     llama-api-server.wasm \
-    --prompt-template llama-3-chat \
     --ctx-size 2048 \
     --model-name SmolLM-360M-Instruct
   ```
@@ -64,9 +58,9 @@ language:
   ```bash
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM-360M-Instruct-Q5_K_M.gguf \
     llama-chat.wasm \
-    --prompt-template llama-3-chat \
-    --ctx-size 2048 \
-  ``` -->
 ## Quantized GGUF Models

 ## Run with LlamaEdge
+- LlamaEdge version: [v0.12.5](https://github.com/LlamaEdge/LlamaEdge/releases/tag/0.12.5) and above
 - Prompt template
+  - Prompt type: `chatml`
   - Prompt string
     ```text
+    <|im_start|>system
+    {system_message}<|im_end|>
+    <|im_start|>user
+    {prompt}<|im_end|>
+    <|im_start|>assistant
+    ```
 - Context size: `2048`
+- Run as LlamaEdge service
   ```bash
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM-360M-Instruct-Q5_K_M.gguf \
     llama-api-server.wasm \
+    --prompt-template chatml \
     --ctx-size 2048 \
     --model-name SmolLM-360M-Instruct
   ```
   ```bash
   wasmedge --dir .:. --nn-preload default:GGML:AUTO:SmolLM-360M-Instruct-Q5_K_M.gguf \
     llama-chat.wasm \
+    --prompt-template chatml \
+    --ctx-size 2048
+  ```
 ## Quantized GGUF Models
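
Note on the diff: the `chatml` prompt string added in this commit can be illustrated with a small shell helper. This is a minimal sketch, not part of LlamaEdge: `render_chatml` is a hypothetical name, and the example messages are made up. When `--prompt-template chatml` is passed, the runtime is expected to do this formatting itself, so the helper only shows the single-turn layout from the README.

```bash
# Hypothetical helper (not a LlamaEdge command): render a single-turn
# ChatML prompt following the prompt string added in this commit.
#   $1 = system message, $2 = user prompt
render_chatml() {
  # The format string mirrors the README template line by line;
  # %s is replaced by the system message and the user prompt in order.
  printf '<|im_start|>system\n%s<|im_end|>\n<|im_start|>user\n%s<|im_end|>\n<|im_start|>assistant\n' \
    "$1" "$2"
}

# Example call with made-up messages.
render_chatml "You are a helpful assistant." "What is WasmEdge?"
```

The output ends with an open `<|im_start|>assistant` turn, which is where the model's answer begins.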