unsubscribe committed on
Commit
f39ab7a
1 Parent(s): c52ae5b

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -24,7 +24,9 @@ ______________________________________________________________________
 
  ## News 🎉
 
- - \[2023/08\] TurboMind supports 4-bit quantization and inference.
+ - \[2023/08\] Turbomind supports 4-bit inference, 2.4x faster than FP16, the fastest open-source implementation🚀.
+ - \[2023/08\] LMDeploy has launched on the [HuggingFace Hub](https://huggingface.co/lmdeploy), providing ready-to-use 4-bit models.
+ - \[2023/08\] LMDeploy supports 4-bit quantization using the [AWQ](https://arxiv.org/abs/2306.00978) algorithm.
  - \[2023/07\] TurboMind supports Llama-2 70B with GQA.
  - \[2023/07\] TurboMind supports Llama-2 7B/13B.
  - \[2023/07\] TurboMind supports tensor-parallel inference of InternLM.