---
license: mit
tags:
- text-generation-inference
- Transformer
- large-language-model
- generative AI
- on-device-computing
- edge-computing
---

**QuicktypeGPT is an on-device large language model (LLM), written in C, that helps you type faster and carry on meaningful conversations.**

The model has only 15M parameters (dim = 288, 6 layers, 6 attention heads, and 6 KV heads) and weighs in at 27 MB. It was pre-trained on a single A40 GPU, and inference runs as a pure C program on a laptop CPU (e.g. AMD or Intel) with decent quality and speed. This project demonstrates that:

- We do not need to train a very sophisticated LLM to achieve satisfactory performance, as long as the LLM is focused on a small, dedicated domain or task.

- We can deploy small LLMs on edge devices (e.g. desktop, laptop, tablet, or phone) to perform inference without relying on servers in the cloud.


For more details, please refer to the [quicktypeGPT](https://github.com/chaoluond/quicktypeGPT) GitHub project.