Commit d8b9180 by Syed-Hasan-8503
Parent(s): 5043722
Update README.md

README.md CHANGED
@@ -11,12 +11,6 @@ library_name: transformers
 This repository contains a compressed version of the Meta Llama-3-8B-Instruct model, utilizing the Palu framework for KV-Cache compression. Palu reduces the hidden dimensions of the KV-Cache through low-rank decomposition, significantly reducing the model's memory footprint while maintaining or enhancing performance.
 
 
-## Evaluation Results
-
-Here's a Markdown file to include the results of your comparisons:
-
----
-
 # Meta Llama-3-8B-Instruct: Palu Compression Results
 
 ## Perplexity (PPL)