---
license: mit
tags:
- text-generation-inference
- Transformer
- large-language-model
- generative AI
- on-device-computing
- edge-computing
---

**QuicktypeGPT is an on-device large language model (LLM), written in C, that helps you type faster and carry on meaningful conversations.**

The model has only 15M parameters (dim = 288, 6 layers, 6 attention heads, and 6 KV heads) and weighs in at 27 MB. It was pre-trained on a single A40 GPU, and inference runs as a pure C program on a laptop CPU (e.g. AMD or Intel) with decent quality and speed. This project demonstrates that:

- We do not need to train a very sophisticated LLM to achieve satisfactory performance, as long as the LLM is focused on a small, dedicated domain or task.

- We can deploy small LLMs on edge devices (e.g. desktop, laptop, tablet, or phone) to perform inference without relying on servers in the cloud.


For more details, please refer to the [quicktypeGPT](https://github.com/chaoluond/quicktypeGPT) GitHub project.