---
language: multilingual
thumbnail:
---

# A fine-tuned model on the GoldP task from the TyDi QA dataset

This model is based on [bert-multi-cased-finetuned-xquadv1](https://huggingface.co/mrm8488/bert-multi-cased-finetuned-xquadv1) and was fine-tuned on the [TyDi QA](https://github.com/google-research-datasets/tydiqa) dataset for the Gold Passage [(GoldP)](https://github.com/google-research-datasets/tydiqa#the-tasks) task.

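For a quick start, the checkpoint can be loaded with the standard `transformers` auto classes. A minimal sketch; the model ID below is inferred from this repo's path:

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Model ID inferred from this repo's path (note the "finedtuned" spelling)
MODEL_ID = "mrm8488/bert-multi-cased-finedtuned-xquad-tydiqa-goldp"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_ID)
```
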
## Details of the language model

The base language model, [bert-multi-cased-finetuned-xquadv1](https://huggingface.co/mrm8488/bert-multi-cased-finetuned-xquadv1), is itself a version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) fine-tuned for the **Q&A** downstream task.

## Details of the TyDi QA dataset

TyDi QA contains 200k human-annotated question-answer pairs in 11 typologically diverse languages, written without seeing the answer and without the use of translation, and is designed for the **training and evaluation** of automatic question answering systems. The [dataset repository](https://github.com/google-research-datasets/tydiqa) provides evaluation code and a baseline system: https://ai.google.com/research/tydiqa

## Details of the downstream task (Gold Passage or GoldP, aka the secondary task)

Given a passage that is guaranteed to contain the answer, predict the single contiguous span of characters that answers the question. The Gold Passage task differs from the [primary task](https://github.com/google-research-datasets/tydiqa/blob/master/README.md#the-tasks) in several ways (a quick demo with this model follows the list):

* only the gold answer passage is provided rather than the entire Wikipedia article;
* unanswerable questions have been discarded, similar to MLQA and XQuAD;
* we evaluate with the SQuAD 1.1 metrics, like XQuAD; and
* Thai and Japanese are removed, since the lack of whitespace breaks some tools.

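As an illustration of the task, here is a sketch using the `question-answering` pipeline from `transformers` (the model ID is inferred from this repo's path; the passage and question are made up):

```python
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="mrm8488/bert-multi-cased-finedtuned-xquad-tydiqa-goldp",
)

# A gold passage that is guaranteed to contain the answer (made-up example)
context = (
    "TyDi QA is a question answering dataset covering 11 typologically "
    "diverse languages, released by Google Research in 2020."
)
question = "How many languages does TyDi QA cover?"

prediction = qa(question=question, context=context)
print(prediction["answer"])  # expected span: "11"
```
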
## Model training

The model was fine-tuned on a Tesla P100 GPU with 25 GB of RAM. The fine-tuning command was:

```bash
python run_squad.py \
  --model_type bert \
  --model_name_or_path mrm8488/bert-multi-cased-finetuned-xquadv1 \
  --do_train \
  --do_eval \
  --train_file /content/dataset/train.json \
  --predict_file /content/dataset/dev.json \
  --per_gpu_train_batch_size 24 \
  --per_gpu_eval_batch_size 24 \
  --learning_rate 3e-5 \
  --num_train_epochs 2.5 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /content/model_output \
  --overwrite_output_dir \
  --save_steps 5000 \
  --threads 40
```

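`run_squad.py` reads SQuAD 1.1-style JSON, so the `train.json` and `dev.json` files above would be the TyDi QA GoldP data converted to that format. A minimal sketch of the expected structure (the record and IDs are illustrative, not taken from this repo):

```python
import json

# Minimal SQuAD 1.1-style record (illustrative; not the actual TyDi QA data)
squad_style = {
    "version": "v1.1",
    "data": [
        {
            "title": "TyDi QA",
            "paragraphs": [
                {
                    "context": "TyDi QA covers 11 typologically diverse languages.",
                    "qas": [
                        {
                            "id": "tydiqa-goldp-0001",
                            "question": "How many languages does TyDi QA cover?",
                            # answer_start is a character offset into `context`
                            "answers": [{"text": "11", "answer_start": 15}],
                        }
                    ],
                }
            ],
        }
    ],
}

with open("train.json", "w", encoding="utf-8") as f:
    json.dump(squad_style, f, ensure_ascii=False)
```
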
## Global Results (dev set)

| Metric    | Value     |
| --------- | --------- |
| **Exact** | **71.06** |
| **F1**    | **82.16** |

## Specific Results (per language)

| Language   | Samples | Exact | F1    |
| ---------- | ------- | ----- | ----- |
| Arabic     | 1314    | 73.29 | 84.72 |
| Bengali    | 180     | 64.60 | 77.84 |
| English    | 654     | 72.12 | 82.24 |
| Finnish    | 1031    | 70.14 | 80.36 |
| Indonesian | 773     | 77.25 | 86.36 |
| Korean     | 414     | 68.92 | 70.95 |
| Russian    | 1079    | 62.65 | 78.55 |
| Swahili    | 596     | 80.11 | 86.18 |
| Telugu     | 874     | 71.00 | 84.24 |

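Exact and F1 are the SQuAD 1.1 metrics mentioned above; `run_squad.py` computes them during `--do_eval`. For reference, a minimal sketch of computing the same metrics with the Hugging Face `evaluate` library (which post-dates this card):

```python
import evaluate  # pip install evaluate

squad_metric = evaluate.load("squad")  # SQuAD 1.1 exact match / F1

predictions = [{"id": "tydiqa-goldp-0001", "prediction_text": "11"}]
references = [
    {
        "id": "tydiqa-goldp-0001",
        "answers": {"text": ["11"], "answer_start": [15]},
    }
]

print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```
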
> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)

> Made with <span style="color: #e25555;">♥</span> in Spain