yentinglin/Llama-3-Taiwan-70B-Instruct

@@ -56,7 +56,7 @@ Checkout [Open TW LLM Leaderboard](https://huggingface.co/spaces/yentinglin/open
 | Model                                                                            | [TMLU](https://arxiv.org/pdf/2403.20180) | Taiwan Truthful QA | [Legal Eval](https://huggingface.co/datasets/lianghsun/tw-legal-benchmark-v1) |  [TW MT-Bench](https://huggingface.co/datasets/MediaTek-Research/TCEval-v2) | Long context | Function Calling | [TMMLU+](https://github.com/iKala/ievals) |
 |---------------------------------------------------------------------------------|--------------|---------------|--------------------|--------------|--------------|-----------------|-----------|
 |      | Subject knowledge | Taiwan localization test | Taiwan legal exam questions |  Chinese multi-turn dialogue | Long-context support | Function calling |  |
-| [**yentinglin/Llama-3-Taiwan-70B-Instruct-rc3**](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3)     | **74.72%**       |     **79.37%**          |      **60.77%**              |      7.54        |    [128k coming soon](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-128k)          |        ✅         |     67.53%      |
 | [**yentinglin/Llama-3-Taiwan-8B-Instruct-rc1**](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct-rc1) | 59.92%       |    60.32%           |         42.11%           |     7.21         |     [128k coming soon](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct-128k)         |        ✅         |    52.28%       |
 | [Claude-3-Opus](https://www.anthropic.com/api) | [73.59% (5-shot)](https://arxiv.org/pdf/2403.20180)       |  [69.84%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/opus-Taiwan-Truthful-QA)    |     [60.29%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/opus)      |       -       |      200k       |        ✅         |     -      |
 | [GPT4-o](https://platform.openai.com/docs/api-reference/chat/create) | [65.56% (0-shot), 69.88% (5-shot)](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/4o-tmlu) | [76.98%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/4o-Taiwan-Truthful-QA)  |    [53.59%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/4o)   | -  |      128k        |        ✅         |    -  |

 | Model                                                                            | [TMLU](https://arxiv.org/pdf/2403.20180) | Taiwan Truthful QA | [Legal Eval](https://huggingface.co/datasets/lianghsun/tw-legal-benchmark-v1) |  [TW MT-Bench](https://huggingface.co/datasets/MediaTek-Research/TCEval-v2) | Long context | Function Calling | [TMMLU+](https://github.com/iKala/ievals) |
 |---------------------------------------------------------------------------------|--------------|---------------|--------------------|--------------|--------------|-----------------|-----------|
 |      | Subject knowledge | Taiwan localization test | Taiwan legal exam questions |  Chinese multi-turn dialogue | Long-context support | Function calling |  |
+| [**yentinglin/Llama-3-Taiwan-70B-Instruct**](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3)     | **74.72%**       |     **79.37%**          |      **60.77%**              |      7.54        |    [128k version](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-128k)          |        ✅         |     67.53%      |
 | [**yentinglin/Llama-3-Taiwan-8B-Instruct-rc1**](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct-rc1) | 59.92%       |    60.32%           |         42.11%           |     7.21         |     [128k coming soon](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct-128k)         |        ✅         |    52.28%       |
 | [Claude-3-Opus](https://www.anthropic.com/api) | [73.59% (5-shot)](https://arxiv.org/pdf/2403.20180)       |  [69.84%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/opus-Taiwan-Truthful-QA)    |     [60.29%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/opus)      |       -       |      200k       |        ✅         |     -      |
 | [GPT4-o](https://platform.openai.com/docs/api-reference/chat/create) | [65.56% (0-shot), 69.88% (5-shot)](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/4o-tmlu) | [76.98%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/4o-Taiwan-Truthful-QA)  |    [53.59%](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct-rc3/tree/main/4o)   | -  |      128k        |        ✅         |    -  |