---
base_model: yentinglin/Llama-3-Taiwan-8B-Instruct
language:
- zh
- en
license: llama3
model_creator: yentinglin
model_name: Llama-3-Taiwan-8B-Instruct
model_type: llama
pipeline_tag: text-generation
quantized_by: minyichen
tags:
- llama-3
---

# Llama-3-Taiwan-8B-Instruct - GPTQ

- Model creator: [Yen-Ting Lin](https://huggingface.co/yentinglin)
- Original model: [Llama-3-Taiwan-8B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct)

<!-- description start -->
## Description

This repo contains GPTQ model files for [Llama-3-Taiwan-8B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct).
<!-- description end -->

<!-- repositories-available start -->
* [GPTQ models for GPU inference](https://huggingface.co/minyichen/Llama-3-Taiwan-8B-Instruct-GPTQ)
* [Yen-Ting Lin's original unquantized model](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct)
<!-- repositories-available end -->

## Quantization parameters

- Bits: 4
- Group size: 128
- Act order: Yes
- Damp %: 0.1
- Seq len: 2048
- Size: 5.34 GB
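
The settings above can be expressed as the kind of configuration dict that GPTQ tooling (e.g., AutoGPTQ-style libraries) consumes. This is an illustrative sketch using conventional field names, not the exact script used to produce this repo:

```python
# Illustrative GPTQ settings mirroring the list above.
# Field names follow common AutoGPTQ conventions and are an
# assumption, not copied from the actual quantization run.
quantize_config = {
    "bits": 4,             # 4-bit weight quantization
    "group_size": 128,     # one scale/zero-point pair per 128 weights
    "desc_act": True,      # "act order": process columns by activation magnitude
    "damp_percent": 0.1,   # Hessian dampening for numerical stability
    "model_seqlen": 2048,  # calibration sequence length
}
```

Smaller group sizes and act order generally improve quantized accuracy at a small cost in speed and file size.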

Quantization took about 50 minutes on an H100 GPU.
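
A minimal inference sketch with 🤗 Transformers is below. It assumes the GPTQ integration (`optimum` and `auto-gptq`) is installed and a CUDA GPU is available; the prompt is illustrative, and loading will download several GB of weights:

```python
# Minimal inference sketch for the GPTQ files in this repo.
# Assumes: transformers with optimum/auto-gptq installed, CUDA GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "minyichen/Llama-3-Taiwan-8B-Instruct-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a Llama-3-style chat prompt and generate a reply.
messages = [{"role": "user", "content": "請用繁體中文自我介紹。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```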