tianyuz committed
Commit e6d32e6
Parent: e1d570e

update readme

Files changed (1): README.md (+19, −11)
README.md CHANGED

@@ -18,8 +18,27 @@ inference: false
 
 ![rinna-icon](./rinna.png)
 
+# Overview
 This repository provides a Japanese GPT-NeoX model of 3.6 billion parameters. The model is based on [`rinna/japanese-gpt-neox-3.6b`](https://huggingface.co/rinna/japanese-gpt-neox-3.6b) and has been finetuned to serve as an instruction-following conversational agent.
 
+* **Model architecture**
+
+    A 36-layer, 2816-hidden-size transformer-based language model.
+
+* **Finetuning**
+
+    The finetuning data is a subset of the following datasets, translated into Japanese.
+    * [Anthropic HH RLHF data](https://huggingface.co/datasets/Anthropic/hh-rlhf)
+    * [FLAN Instruction Tuning data](https://github.com/google-research/FLAN)
+    * [Stanford Human Preferences Dataset](https://huggingface.co/datasets/stanfordnlp/SHP)
+
+    The data will **not** be released.
+
+* **Authors**
+
+    [Tianyu Zhao](https://huggingface.co/tianyuz) and [Kei Sawada](https://huggingface.co/keisawada)
+
+# I/O Format
 A special format has been adopted to construct inputs.
 * An input prompt is formatted as a conversation between `ユーザー` and `システム`.
 * Each input utterance consists of (1) its speaker (`"ユーザー"` or `"システム"`), (2) a colon (`":"`), (3) a whitespace (`" "`), and (4) utterance text (e.g. `"世界で一番高い山は?"`).
@@ -93,17 +112,6 @@ print(output)
 4. 道玄坂です。道玄坂は、日本の商業地区である坂道です。</s>"""
 ~~~~
 
-# Model architecture
-A 36-layer, 2816-hidden-size transformer-based language model.
-
-# Finetuning
-The finetuning data is the subset of the following datasets and has been translated into Japanese.
-* [Anthropic HH RLHF data](https://huggingface.co/datasets/Anthropic/hh-rlhf)
-* [FLAN Instruction Tuning data](https://github.com/google-research/FLAN)
-* [Stanford Human Preferences Dataset](https://huggingface.co/datasets/stanfordnlp/SHP)
-
-The data will **not** be released.
-
 # Tokenization
 The model uses a [sentencepiece](https://github.com/google/sentencepiece)-based tokenizer.
 * The tokenizer has a vocabulary size of 32,000.
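
Three details in the updated README are concrete enough to illustrate in code. First, the Overview's architecture line (36 layers, hidden size 2816) maps directly onto `transformers` config fields. The sketch below is only an illustration of those two figures; `GPTNeoXConfig` leaves every other field at library defaults, which are not claimed to match the released model.

~~~~
from transformers import GPTNeoXConfig

# Illustration only: carries just the two figures stated in the Overview.
# All remaining fields are transformers defaults, not the released
# model's actual configuration.
config = GPTNeoXConfig(
    num_hidden_layers=36,  # "36-layer"
    hidden_size=2816,      # "2816-hidden-size"
)
print(config.num_hidden_layers, config.hidden_size)  # -> 36 2816
~~~~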
 
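Second, the I/O Format rules fully determine each utterance (speaker, colon, whitespace, text), but this hunk does not show how utterances are joined or how the prompt is terminated. A minimal Python sketch, assuming a plain newline separator and a trailing `システム: ` cue, both hypothetical:

~~~~
# Minimal sketch of the utterance format: speaker + ":" + " " + text.
# The newline separator and the trailing "システム: " cue are assumptions;
# neither is shown in the hunk above.
def build_prompt(turns):
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    return "\n".join(lines) + "\n" + "システム: "

print(build_prompt([("ユーザー", "世界で一番高い山は?")]))
# ユーザー: 世界で一番高い山は?
# システム: 
~~~~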
 
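Third, the Tokenization section pins down two checkable facts: a sentencepiece-based tokenizer and a 32,000-entry vocabulary. A hedged way to verify them against the base model named in the Overview (this diff never states the finetuned repo's own id); `use_fast=False` is assumed, as sentencepiece-backed tokenizers in `transformers` often require the slow class:

~~~~
from transformers import AutoTokenizer

# Loads the tokenizer of the base model named in the Overview; the
# finetuned model's repo id is not stated in this diff.
# use_fast=False (the slow, sentencepiece-backed class) is an assumption.
tokenizer = AutoTokenizer.from_pretrained(
    "rinna/japanese-gpt-neox-3.6b", use_fast=False
)
print(tokenizer.vocab_size)  # expected: 32000, per the Tokenization section
~~~~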