This model adapts T5 to the Arabic language by pre-training T5 on Arabic news and encyclopedia corpora.
Total corpora size is 17GB. We restrict our corpora to news and encyclopedias to enhance the model's performance on informative tasks such as factoid question answering and on generative tasks that use Classical Arabic (الفصحى). This also gives our models an advantage if you do not want the generated text to contain inappropriate language. This model uses an efficient implementation of T5 which reduces fine-tuning time and memory usage ([link](https://arxiv.org/abs/2109.10686)).
```diff
- We changed the names of our models to match the original paper's naming (https://arxiv.org/abs/2109.10686); refer to page 8, Table 4.

ArabicT5-Base   --> ArabicT5-17GB-small
ArabicT5-Large  --> ArabicT5-17GB-base
ArabicT5-xLarge --> ArabicT5-17GB-large
```
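
To load one of these checkpoints, here is a minimal sketch with the Hugging Face `transformers` library; the repository id `sultan/ArabicT5-17GB-base` below is an assumption based on the names above, so substitute the checkpoint you actually use:

```python
# A minimal loading sketch; the repo id below is an assumption, not confirmed here.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "sultan/ArabicT5-17GB-base"  # assumed Hub id; adjust to your checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# T5 is text-to-text: encode an input, then generate the output text.
inputs = tokenizer("نص عربي للتجربة", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```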
## Pre-training Settings and Results on TyDi QA Development Dataset (model in this card is highlighted in bold)

| Model | Hidden Size | Atten. Heads | Atten. Layers | Vocab | Hardware | Training Steps | Batch | Train x Batch Factor | Corpora |
|-------|-------------|--------------|---------------|-------|----------|----------------|-------|----------------------|---------|
| AraT5-base | 768 | 12 | 12 | 110K | TPUv3-8 | 1M | 128 | 1.0x | 248GB, 29B tokens (MSA + Tweets) |
| AraT5-msa-base | 768 | 12 | 12 | 110K | TPUv3-8 | 1M | 128 | 1.0x | 70GB (MSA) |
| AraT5-tweets-base | 768 | 12 | 12 | 110K | TPUv3-8 | 1M | 128 | 1.0x | 178GB (Tweets) |
| AraBART-base | 768 | 12 | 12 | 50K | 128 V100 GPUs (60h) | 25 epochs | - | - | 73GB (MSA) |
| mT5-base | 768 | 12 | 12 | 250K | TPUv3-32 | 1M | 1024 | 8.0x | 6.3T tokens (mC4) |
| ArabicT5-17GB-small | 512 | 8 | 20 | 32K | TPUv3-32 | 256K | 256 | 0.5x | 17GB (MSA) |
| ArabicT5-17GB-base | 768 | 12 | 16 | 32K | TPUv3-128 | 500K | 512 | 2.0x | 17GB (MSA) |
| ArabicT5-17GB-large | 768 | 12 | 36 | 32K | TPUv3-128 | 500K | 512 | 2.0x | 17GB (MSA) |
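
The Train x Batch Factor column appears to be each model's total pre-training budget (training steps × batch size) relative to AraT5's 1M steps × batch 128; this reading of the column is our assumption, and a quick check reproduces the listed values:

```python
# Sanity check of the "Train x Batch Factor" column, read as
# (training steps * batch size) relative to AraT5's 1M steps * batch 128.
baseline = 1_000_000 * 128  # AraT5-base pre-training budget

budgets = {
    "mT5-base": (1_000_000, 1024),
    "ArabicT5-17GB-small": (256_000, 256),
    "ArabicT5-17GB-base": (500_000, 512),
    "ArabicT5-17GB-large": (500_000, 512),
}
for name, (steps, batch) in budgets.items():
    print(f"{name}: {steps * batch / baseline:.1f}x")
# -> 8.0x, 0.5x, 2.0x, 2.0x, matching the table
```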
## Results on TyDi QA, HARD, Sentiment Analysis, Sarcasm Detection (best score is highlighted in bold)

| Model | <center>TyDi QA | <center>HARD | <center>ArSarcasm-v2-Sentiment | <center>ArSarcasm-v2-Sarcasm | <center>XL-SUM |
|-------|-----------------|--------------|--------------------------------|------------------------------|----------------|
| AraT5-base | <center>70.36/84.21 | <center>96.49 | <center>69.7/72.63 | <center>60.44 | <center>30.31 |
| AraT5-msa-base | <center>70.90/84.00 | <center>**96.52** | <center>70.03/72.73 | <center>60.69 | <center>27.36 |
| AraT5-tweets-base | <center>65.14/79.00 | <center>96.26 | <center>70.67/73.52 | <center>61.11 | <center>25.08 |
| mT5-base | <center>72.20/84.13 | <center>96.24 | <center>67.33/68.78 | <center>52.18 | <center>25.68 |
| AraBART-base | <center>48.75/71.15 | <center>96.11 | <center>66.23/68.18 | <center>56.30 | <center>31.20 |
| ArabicT5-17GB-small | <center>70.79/84.76 | <center>96.36 | <center>68.93/71.20 | <center>58.93 | <center>29.19 |
| ArabicT5-17GB-base | <center>73.29/86.08 | <center>96.40 | <center>70.4/73.01 | <center>59.79 | <center>30.30 |
| ArabicT5-17GB-large | <center>**75.46/87.12** | <center>96.50 | <center>**72.23/75.17** | <center>**61.66** | <center>**31.70** |

Evaluation metrics: TyDi QA (EM/F1), HARD (Accuracy), Sentiment Analysis (Accuracy / F1-PN positive-negative), Sarcasm Detection (F1-sarcastic), XL-SUM (ROUGE-L with stemmer).
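
As a pointer for reproducing these metrics, here is a minimal sketch using Hugging Face's `evaluate` library; the predictions and references below are placeholders, not data from our experiments:

```python
# Placeholder examples only; not data from the experiments above.
import evaluate

# TyDi QA-style EM/F1 via the SQuAD metric format.
squad = evaluate.load("squad")
predictions = [{"id": "0", "prediction_text": "الرياض"}]
references = [{"id": "0", "answers": {"text": ["الرياض"], "answer_start": [0]}}]
print(squad.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}

# XL-SUM-style ROUGE-L with stemming enabled.
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=["ملخص قصير"], references=["ملخص قصير"], use_stemmer=True))
```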

Please note that these results were obtained with the same fixed hyperparameters for all models.

| <center>Model | <center>Run Time (hh:mm:ss) | <center>Results on TyDi QA (EM/F1) |
|---------------|-----------------------------|------------------------------------|
| AraT5-msa-base | <center>00:20:41 | <center>69.92/82.50 |
| AraT5-base | <center>00:20:53 | <center>68.40/81.97 |
| AraT5-tweets-base | <center>00:21:17 | <center>61.67/75.96 |
| mT5-base | <center>00:28:24 | <center>57.98/72.81 |
| AraBART-base | <center>00:10:57 | <center>43.76/66.30 |
| ArabicT5-17GB-small | <center>00:20:00 | <center>70.79/83.85 |
| ArabicT5-17GB-base | <center>00:23:50 | <center>71.22/84.42 |
| ArabicT5-17GB-large | <center>00:52:17 | <center>72.86/86.00 |

Please note that we can further speed up ArabicT5-17GB-small by increasing the batch size, since its smaller hidden size (512) allows it to handle larger batches than other base-scale models.
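
For illustration, here is a hedged sketch of raising the fine-tuning batch size with `transformers`; the hyperparameter values are placeholders, not the settings used in our experiments:

```python
# Illustrative only; hyperparameter values are placeholders, not our settings.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="arabict5-17gb-small-finetune",
    per_device_train_batch_size=64,  # the 512-hidden-size model fits larger batches
    per_device_eval_batch_size=64,
    learning_rate=3e-4,
    num_train_epochs=3,
)
```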