Update README.md
add itrex inference eg
README.md
CHANGED
@@ -42,6 +42,22 @@ python3 main.py \
 
 
 ### Use the model
+### INT4 Inference with ITREX on CPU
+Install the latest [intel-extension-for-transformers](https://github.com/intel/intel-extension-for-transformers)
+```python
+from intel_extension_for_transformers.transformers import AutoModelForCausalLM
+from transformers import AutoTokenizer
+quantized_model_dir = "Intel/neural-chat-7b-v3-1-int4-inc"
+model = AutoModelForCausalLM.from_pretrained(quantized_model_dir,
+    device_map="auto",
+    trust_remote_code=False,
+    use_neural_speed=False,
+    )
+tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)
+print(tokenizer.decode(model.generate(**tokenizer("There is a girl who likes adventure,", return_tensors="pt").to(model.device),max_new_tokens=50)[0]))
+<s> There is a girl who likes adventure, who loves to travel, who is always looking for new experiences. She is a dreamer, a doer, a thinker, a believer. She is a girl who is not afraid to take risks, to make mistakes, to learn from
+```
+
 
 ### INT4 Inference with AutoGPTQ
 