---
title: README
emoji: 🚀
colorFrom: indigo
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
---
[Github](https://github.com/InternLM/lmdeploy)
English | [简体中文](https://github.com/InternLM/lmdeploy/blob/main/README_zh-CN.md)
👋 Join us on Twitter, Discord, and WeChat
______________________________________________________________________
## News 🎉
- \[2023/08\] TurboMind supports 4-bit inference, 2.4x faster than FP16, the fastest open-source implementation 🚀.
- \[2023/08\] LMDeploy has launched on the [HuggingFace Hub](https://huggingface.co/lmdeploy), providing ready-to-use 4-bit models (see the usage sketch after this list).
- \[2023/08\] LMDeploy supports 4-bit quantization using the [AWQ](https://arxiv.org/abs/2306.00978) algorithm.
- \[2023/07\] TurboMind supports Llama-2 70B with GQA.
- \[2023/07\] TurboMind supports Llama-2 7B/13B.
- \[2023/07\] TurboMind supports tensor-parallel inference of InternLM.
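
As a minimal sketch of how the 4-bit models mentioned above might be loaded, assuming a recent `lmdeploy` release that ships the `pipeline` API and `TurbomindEngineConfig` (the model id below is illustrative; check the [HuggingFace Hub](https://huggingface.co/lmdeploy) for the actual published weights):

```python
# Minimal sketch: running a 4-bit AWQ model with LMDeploy's pipeline API.
# Assumes a recent lmdeploy release; the model id is hypothetical.
from lmdeploy import pipeline, TurbomindEngineConfig

# model_format='awq' asks the TurboMind backend to load 4-bit AWQ weights;
# tp=1 runs on a single GPU (raise it for tensor-parallel inference).
pipe = pipeline(
    'lmdeploy/llama2-chat-7b-w4',  # hypothetical 4-bit model repo
    backend_config=TurbomindEngineConfig(model_format='awq', tp=1),
)

responses = pipe(['Hello! Please introduce yourself.'])
print(responses[0].text)
```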
______________________________________________________________________