Model Card for FLAN-T5 XL

drawing

TL;DR
Model Details
Usage
Uses
Bias, Risks, and Limitations
Training Details
Evaluation
Environmental Impact
Citation

TL;DR

If you already know T5, FLAN-T5 is just better at everything. For the same number of parameters, these models have been fine-tuned on more than 1000 additional tasks covering also more languages. As mentioned in the first few lines of the abstract :

Flan-PaLM 540B achieves state-of-the-art performance on several benchmarks, such as 75.2% on five-shot MMLU. We also publicly release Flan-T5 checkpoints,1 which achieve strong few-shot performance even compared to much larger models, such as PaLM 62B. Overall, instruction finetuning is a general method for improving the performance and usability of pretrained language models.

Disclaimer: Content from this model card has been written by the Hugging Face team, and parts of it were copy pasted from the T5 model card.

Model Details

Model Description

The details are in the original google/flan-t5-xl

deepfile
/

flan-t5-xl-gguf

Model Card for FLAN-T5 XL

Table of Contents

TL;DR

Model Details

Model Description

Datasets used to train deepfile/flan-t5-xl-gguf