h2oai/h2o-danube2-1.8b-sft

Text Generation

large language model

text-generation-inference

Inference Endpoints

psinger committed on Apr 22

Commit f7c8f18 • 1 Parent(s): 9013ff4

Update README.md

Files changed (1)

README.md +6 -0

@@ -76,6 +76,12 @@ res = pipe(
 print(res[0]["generated_text"])
 ```
+This will apply and run the correct prompt format out of the box:
+```
+<|prompt|>Why is drinking water so healthy?</s><|answer|>
+```
 ## Quantization and sharding
 You can load the models using quantization by specifying ```load_in_8bit=True``` or ```load_in_4bit=True```. Also, sharding on multiple GPUs is possible by setting ```device_map=auto```.
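
For reviewers who want to check the prompt format by hand, the single-turn string added in this commit can be reproduced with plain string formatting. This is only a minimal sketch: `build_prompt` and the three constants are hypothetical names introduced here for illustration, and the format is inferred solely from the one example in the diff. It does not cover multi-turn conversations or any special tokens the tokenizer may add on its own.

```python
# Hypothetical helper for illustration; not part of the model repository.
# The token strings are copied from the example added in this commit.
PROMPT_TOKEN = "<|prompt|>"
ANSWER_TOKEN = "<|answer|>"
EOS_TOKEN = "</s>"


def build_prompt(question: str) -> str:
    """Wrap one user question in the single-turn format shown in the diff."""
    return f"{PROMPT_TOKEN}{question}{EOS_TOKEN}{ANSWER_TOKEN}"


if __name__ == "__main__":
    print(build_prompt("Why is drinking water so healthy?"))
```

Comparing this string with what the pipeline actually feeds the model is a quick way to confirm that the README's "out of the box" claim holds for a given `transformers` version.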