### Method 1
Download this repository and run the following commands:
```bash
git lfs install
git clone https://huggingface.co/NexaAIDev/Squid
python inference_example.py
```

### Method 2
Install the `nexaai-squid` package:
```bash
pip install nexaai-squid
```

Then run the following commands:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
import torch
from squid.configuration_squid import SquidConfig
from squid.modeling_squid import SquidForCausalLM


def inference_instruct(mycontext, question, device="cuda:0"):
    ...  # function body omitted in this excerpt


if __name__ == "__main__":
    device_name = "cuda:0" if torch.cuda.is_available() else "cpu"
    # Register the custom Squid classes so the Auto* factories recognize the "squid" model type
    AutoConfig.register("squid", SquidConfig)
    AutoModelForCausalLM.register(SquidConfig, SquidForCausalLM)
    tokenizer = AutoTokenizer.from_pretrained('NexaAIDev/Squid')
    model = AutoModelForCausalLM.from_pretrained('NexaAIDev/Squid', trust_remote_code=True, torch_dtype=torch.bfloat16, device_map=device_name)
    ...  # prompt construction and generation omitted in this excerpt
```
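
Assuming the omitted body of `inference_instruct` returns the decoded answer as a string, the `__main__` block could end with a call like the following. This is a hypothetical usage sketch; the context and question values are illustrative, not from the repository.

```python
# Hypothetical continuation of the __main__ block above (illustrative values only).
# Assumes inference_instruct, whose body is omitted in this excerpt, returns a string.
mycontext = "<paste a long context document here>"
question = "What is the main topic of the context?"
answer = inference_instruct(mycontext, question, device=device_name)
print(answer)
```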

## Training Process
Squid's training involves three stages:
1. Restoration Training: Reconstructing original context from compressed embeddings
2. Continual Training: Generating context continuations from partial compressed contexts
3. Instruction Fine-tuning: Generating responses to queries given compressed contexts

This multi-stage approach progressively enhances the model's ability to handle long contexts and generate appropriate responses.
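
For intuition, the stages differ mainly in which (input, target) pairs the decoder is trained on. The sketch below is a rough, hypothetical illustration only: `compress` stands in for Squid's context encoder that produces compressed memory embeddings, and `stage_pair` is an invented helper; neither name comes from the actual codebase.

```python
def compress(text: str) -> str:
    """Placeholder for the context encoder that maps text to compressed embeddings."""
    return f"<memory embeddings for {len(text.split())} tokens>"

def stage_pair(stage: str, context: str, question: str = "", answer: str = ""):
    if stage == "restoration":
        # Stage 1: reconstruct the original context from its compressed form
        return compress(context), context
    if stage == "continual":
        # Stage 2: compress a prefix, train to generate the continuation
        cut = len(context) // 2
        return compress(context[:cut]), context[cut:]
    if stage == "instruction":
        # Stage 3: condition on the compressed context plus a query, target the response
        return compress(context) + " " + question, answer
    raise ValueError(f"unknown stage: {stage}")

for stage in ("restoration", "continual", "instruction"):
    print(stage, stage_pair(stage, "a long input document ...", "What is it about?", "A demo answer."))
```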

## Citation
If you use Squid in your research, please cite our paper:

```bibtex
@article{chen2024squidlongcontextnew,
  title={Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models},
  author={Wei Chen and Zhiyuan Li and Shuo Xin and Yihao Wang},
  year={2024},