ybelkada committed
Commit 266006c
1 parent: be6ac3c

Update README.md

Files changed (1)
  1. README.md +12 -14
README.md CHANGED
```diff
@@ -68,7 +68,7 @@ tags:
 datasets:
 - svakulenk0/qrecc
 - taskmaster2
-- djaym7/wiki_dialog
+- djaym7/wiki_dialog
 - deepmind/code_contests
 - lambada
 - gsm8k
@@ -87,6 +87,7 @@ license: apache-2.0
 
 # Table of Contents
 
+0. [TL;DR](#TL;DR)
 1. [Model Details](#model-details)
 2. [Usage](#usage)
 3. [Uses](#uses)
@@ -96,28 +97,25 @@ license: apache-2.0
 7. [Environmental Impact](#environmental-impact)
 8. [Citation](#citation)
 9. [Model Card Authors](#model-card-authors)
-10. [How To Get Started With the Model](#how-to-get-started-with-the-model)
 
-# Model Details
+# TL;DR
 
-## Model Description
+> Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints,1 which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.
 
-The developers of the Text-To-Text Transfer Transformer (T5) [write](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html):
+# Model Details
 
-> With T5, we propose reframing all NLP tasks into a unified text-to-text-format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task.
+## Model Description
 
-T5-Base is the checkpoint with 220 million parameters.
 
-- **Developed by:** Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane, Gu Zhuyun, Dai Mirac, Suzgun Xinyun, Chen Aakanksha, Chowdhery Sharan, Narang Gaurav, Mishra Adams, Yu Vincent, Zhao Yanping, Huang Andrew, Dai Hongkun, Yu Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts; Denny Zhou, Quoc V. Le, Jason Wei∗ See [associated paper](https://arxiv.org/pdf/2210.11416.pdf) and [GitHub repo](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)
 - **Model type:** Language model
-- **Language(s) (NLP):** English, French, Romanian, German
+- **Language(s) (NLP):** English, Spanish, Japanese, Persian, Hindi, French, Chinese, Bengali, Gujarati, German, Telugu, Italian, Arabic, Polish, Tamil, Marathi, Malayalam, Oriya, Panjabi, Portuguese, Urdu, Galician, Hebrew, Korean, Catalan, Thai, Dutch, Indonesian, Vietnamese, Bulgarian, Filipino, Central Khmer, Lao, Turkish, Russian, Croatian, Swedish, Yoruba, Kurdish, Burmese, Malay, Czech, Finnish, Somali, Tagalog, Swahili, Sinhala, Kannada, Zhuang, Igbo, Xhosa, Romanian, Haitian, Estonian, Slovak, Lithuanian, Greek, Nepali, Assamese, Norwegian
 - **License:** Apache 2.0
-- **Related Models:** [All T5 Checkpoints](https://huggingface.co/models?search=t5)
+- **Related Models:** [All FLAN-T5 Checkpoints](https://huggingface.co/models?search=flan-t5)
+- **Related Original Models:** [All Original FLAN-T5 Checkpoints](https://github.com/google-research/t5x/blob/main/docs/models.md#flan-t5-checkpoints)
 - **Resources for more information:**
-  - [Research paper](https://jmlr.org/papers/volume21/20-074/20-074.pdf)
-  - [Google's T5 Blog Post](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html)
-  - [GitHub Repo](https://github.com/google-research/text-to-text-transfer-transformer)
-  - [Hugging Face T5 Docs](https://huggingface.co/docs/transformers/model_doc/t5)
+  - [Research paper](https://arxiv.org/pdf/2210.11416.pdf)
+  - [GitHub Repo](https://github.com/google-research/t5x)
+  - [Hugging Face FLAN-T5 Docs (Similar to T5)](https://huggingface.co/docs/transformers/model_doc/t5)
 
 # Usage
 
```
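The diff ends at the `# Usage` heading, whose body is not shown in this hunk. For orientation, here is a minimal sketch of loading an instruction-finetuned FLAN-T5 checkpoint with the Hugging Face `transformers` library; the checkpoint id `google/flan-t5-base`, the prompt, and the generation settings are illustrative assumptions, not content from this commit.

```python
# A minimal sketch of running a FLAN-T5 checkpoint with Hugging Face
# transformers. The checkpoint id "google/flan-t5-base" and the example
# prompt are assumptions; the commit above does not show the Usage section.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")

# FLAN-T5 is instruction-finetuned, so a plain natural-language instruction
# works as input; no T5-style task prefix is required.
inputs = tokenizer("Translate English to German: How old are you?",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```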