yentinglin committed
Commit 3838e02 · 1 Parent(s): a9ea6ae
Update README.md
README.md CHANGED
@@ -43,7 +43,7 @@ Llama-3-Taiwan-70B is a large language model finetuned for Traditional Mandarin
 - Inference Framework: [NVIDIA TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM)
 - Base model: [Llama-3 70B](https://llama.meta.com/llama3/)
 - Hardware: [NVIDIA DGX H100](https://www.nvidia.com/zh-tw/data-center/dgx-h100/) on Taipei-1
-- Context length: 8K tokens (
+- Context length: 8K tokens ([128k version](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-128k))
 - Batch size: 2M tokens per step
 
 # Evaluation
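
The changed line replaces a truncated parenthetical with a link to a long-context (128k-token) variant of the model. As a minimal sketch (not part of the commit), the linked checkpoint could be loaded with Hugging Face transformers roughly as follows; the repo id comes from the URL added in this diff, while the chat-template usage and generation settings are assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from the URL added in this commit.
model_id = "yentinglin/Llama-3-Taiwan-70B-Instruct-128k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# A 70B model generally needs multiple GPUs; device_map="auto" shards
# the weights across whatever devices are visible.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
)

# Example Traditional Mandarin prompt (illustrative, not from the commit).
messages = [{"role": "user", "content": "你好,請自我介紹。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```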