rinna/japanese-gpt-neox-3.6b

tianyuz committed on May 29, 2023

Commit 72440c8 • 1 Parent(s): e993984

update readme

Files changed (1)

README.md +16 -10

@@ -20,15 +20,25 @@ inference: false
 ![rinna-icon](./rinna.png)
 # Overview
-This repository provides a Japanese GPT-NeoX model of 3.6 billion parameters. The model was trained using code based on [EleutherAI/gpt-neox](https://github.com/EleutherAI/gpt-neox).
-## Model architecture
-A 36-layer, 2816-hidden-size transformer-based language model.
-# Pre-training
-The model was trained on around **312.5B** tokens from [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz), [Japanese C4](https://huggingface.co/datasets/mc4), and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective.
-A final validation perplexity of **8.68** has been reached.
 # How to use the model
@@ -89,9 +99,5 @@ The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based
     # 'გამარ[UNK]ობა 吾輩は 猫である </s>'
     ~~~
-# Authors
-* [Tianyu Zhao](https://huggingface.co/tianyuz)
-* [Kei Sawada](https://huggingface.co/keisawada)
 # Licenese
 [The MIT license](https://opensource.org/licenses/MIT)

 ![rinna-icon](./rinna.png)
 # Overview
+This repository provides a Japanese GPT-NeoX model of 3.6 billion parameters.
+* **Library**
+    The model was trained using code based on [EleutherAI/gpt-neox](https://github.com/EleutherAI/gpt-neox).
+* **Model architecture**
+    A 36-layer, 2816-hidden-size transformer-based language model.
+* **Pre-training**
+    The model was trained on around **312.5B** tokens from [Japanese CC-100](http://data.statmt.org/cc-100/ja.txt.xz), [Japanese C4](https://huggingface.co/datasets/mc4), and [Japanese Wikipedia](https://dumps.wikimedia.org/other/cirrussearch) to optimize a traditional language modelling objective.
+    A final validation perplexity of **8.68** has been reached.
+* **Authors**
+    [Tianyu Zhao](https://huggingface.co/tianyuz) and [Kei Sawada](https://huggingface.co/keisawada)
 # How to use the model
     # 'გამარ[UNK]ობა 吾輩は 猫である </s>'
     ~~~
 # Licenese
 [The MIT license](https://opensource.org/licenses/MIT)
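
As a rough sanity check on the architecture line in the README (36 layers, hidden size 2816, "3.6 billion parameters"), the sketch below estimates the parameter count with the common `12 * L * h^2` approximation for the transformer blocks plus separate input and output embedding matrices. The vocabulary size of 32000 and the untied embeddings are assumptions not stated in this diff, and the function name is hypothetical; biases and layer-norm weights are ignored.

```python
def estimate_gpt_neox_params(n_layers: int, hidden_size: int, vocab_size: int) -> int:
    """Approximate parameter count of a GPT-NeoX-style decoder.

    Each block is counted as 4*h^2 (attention: QKV + output projection)
    plus 8*h^2 (MLP with a 4x expansion), i.e. 12*h^2 in total.
    Biases and layer norms are ignored, so this is a slight underestimate.
    """
    block_params = 12 * n_layers * hidden_size ** 2
    # Assumption: input and output embeddings are separate (untied) matrices.
    embedding_params = 2 * vocab_size * hidden_size
    return block_params + embedding_params


if __name__ == "__main__":
    # 36 layers and hidden size 2816 come from the README; 32000 is an assumed vocabulary size.
    total = estimate_gpt_neox_params(n_layers=36, hidden_size=2816, vocab_size=32000)
    print(f"approx. {total / 1e9:.2f}B parameters")
```

Under these assumptions the estimate lands close to the 3.6 billion figure quoted in the overview, which suggests the layer count and hidden size in the README are consistent with the stated model size.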