woodchen7 and wellzhong committed
Commit
7f8c549
1 Parent(s): 08d72b4

Update README.md (#12)


- Update README.md (df03933f76603afbe5f540f85e76095cabca4953)


Co-authored-by: wellzhong <wellzhong@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +34 -0
README.md CHANGED
@@ -18,6 +18,40 @@ license_link: https://huggingface.co/tencent/Tencent-Hunyuan-Large/blob/main/LIC
 </p><p align="center">
 <a href="https://arxiv.org/abs/2411.02265" style="color: blue;"><b>Technical Report</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;<a href="https://huggingface.co/spaces/tencent/Hunyuan-Large"><b>Demo</b></a>&nbsp;&nbsp;&nbsp;|&nbsp;&nbsp;<a href="https://cloud.tencent.com/document/product/851/112032" style="color: blue;"><b>Tencent Cloud TI</b></a>&nbsp;&nbsp;&nbsp;</p>
 
+
+
+ <p>
+ <table align="center">
+ <tbody>
+ <tr align="center">
+ <td align="center" colspan="3"><strong>Download Models</strong></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;"><strong>Models</strong></td>
+ <td align="center" style="width: 400px;"><strong>Huggingface Download URL</strong></td>
+ <td align="center" style="width: 400px;"><strong>Tencent Cloud Download URL</strong></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;">Hunyuan-A52B-Instruct-FP8</td>
+ <td style="width: 400px;"><a href="https://huggingface.co/tencent/Tencent-Hunyuan-Large/tree/main/Hunyuan-A52B-Instruct-FP8">Hunyuan-A52B-Instruct-FP8</a></td>
+ <td style="width: 400px;"><a href="https://hunyuan-large-model-1258344703.cos.ap-guangzhou.myqcloud.com/Hunyuan-A52B-Instruct-128k-fp8.zip">Hunyuan-A52B-Instruct-FP8</a></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;">Hunyuan-A52B-Instruct</td>
+ <td style="width: 400px;"><a href="https://huggingface.co/tencent/Tencent-Hunyuan-Large/tree/main/Hunyuan-A52B-Instruct">Hunyuan-A52B-Instruct</a></td>
+ <td style="width: 400px;"><a href="https://hunyuan-large-model-1258344703.cos.ap-guangzhou.myqcloud.com/Hunyuan-A52B-Instruct-128k.zip">Hunyuan-A52B-Instruct</a></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;">Hunyuan-A52B-Pretrain</td>
+ <td style="width: 400px;"><a href="https://huggingface.co/tencent/Tencent-Hunyuan-Large/tree/main/Hunyuan-A52B-Pretrain">Hunyuan-A52B-Pretrain</a></td>
+ <td style="width: 400px;"><a href="https://hunyuan-large-model-1258344703.cos.ap-guangzhou.myqcloud.com/Hunyuan-A52B-Pretrain-256k.zip">Hunyuan-A52B-Pretrain</a></td>
+ </tr>
+ </tbody>
+ </table>
+ </p>
+
+
+
 ### Model Introduction
 
 With the rapid development of artificial intelligence technology, large language models (LLMs) have made significant progress in fields such as natural language processing, computer vision, and scientific tasks. However, as the scale of these models increases, optimizing resource consumption while maintaining high performance has become a key challenge. To address this challenge, we have explored Mixture of Experts (MoE) models. The currently unveiled Hunyuan-Large (Hunyuan-MoE-A52B) model is the largest open-source Transformer-based MoE model in the industry, featuring a total of 389 billion parameters and 52 billion active parameters.
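
For the model variants listed in the download table added above, the sketch below shows one way to fetch a single variant with the `huggingface_hub` Python client. This is a minimal illustration, assuming the repository layout shown in the table; the local target directory is an arbitrary choice for the example.

```python
# Minimal sketch: pull one model variant from the Hugging Face repository
# listed in the download table. allow_patterns restricts the download to
# the files under the chosen subfolder.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="tencent/Tencent-Hunyuan-Large",
    allow_patterns=["Hunyuan-A52B-Instruct-FP8/*"],  # subfolder name from the table
    local_dir="./Tencent-Hunyuan-Large",             # hypothetical local target
)
print(f"Model files downloaded to: {local_path}")
```

The Tencent Cloud URLs in the table point to plain `.zip` archives, so they can alternatively be fetched with any HTTP client (e.g. `wget` or `curl`) and unpacked locally.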