
llama2.go

This is the model repository for the llama2.go project, which can run Meta's LLaMA 2 models with a small memory footprint, for example on low-memory devices such as a Raspberry Pi.

Memory usage

| Model | Precision | Memory | Memory (cached params) |
|-------|-----------|--------|------------------------|
| 7B    | bf16      | 600M+  | 25G+                   |
| 13B   | bf16      | 1G+    | 43G+                   |
| 70B   | bf16      | 3G+    | untested                |
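
The gap between the two columns reflects how weights can stay in the OS page cache rather than in the process heap. Below is a minimal sketch, not the project's actual loader, showing the general technique of memory-mapping a weights file so resident memory stays small while the kernel caches the parameters; the file name `llama2_7b.bin` is a placeholder.

```go
// Sketch: memory-map a model weights file so the process heap stays small.
// The kernel pages weights in lazily and keeps them in the page cache
// ("cached params" in the table above). Unix-only (uses syscall.Mmap).
package main

import (
	"encoding/binary"
	"fmt"
	"log"
	"math"
	"os"
	"syscall"
)

func main() {
	f, err := os.Open("llama2_7b.bin") // placeholder path to converted weights
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		log.Fatal(err)
	}

	// Map the whole file read-only; pages are faulted in on demand and
	// accounted to the page cache, not to this process's heap.
	data, err := syscall.Mmap(int(f.Fd()), 0, int(info.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
	if err != nil {
		log.Fatal(err)
	}
	defer syscall.Munmap(data)

	// Example: read the first float32 value directly from the mapping.
	w := math.Float32frombits(binary.LittleEndian.Uint32(data[:4]))
	fmt.Printf("mapped %d bytes, first value = %f\n", len(data), w)
}
```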