Update README.md
README.md
CHANGED
@@ -30,7 +30,28 @@ This model adapts T5 to the Arabic language by pre-training T5 on ArabicWikipedia, Ma
| ArabicT5-Large | <center>73.29/86.08 |<center>96.40|<center>70.4/73.01|<center>59.79|
| ArabicT5-xLarge | <center>**75.46/87.12** |<center>96.50| <center>**72.23/75.17**|<center>**61.66**|

-Evaluation Metrics
+Evaluation Metrics: TyDi QA (EM/F1), HARD (Accuracy), Sentiment Analysis (Accuracy / F1-PN positive-negative), Sarcasm Detection (F1-sarcastic)
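For readers unfamiliar with the QA metrics, EM and F1 here are the standard SQuAD-style span metrics commonly used for TyDi QA. The sketch below is a simplified illustration only, with plain whitespace tokenization and no answer normalization; it is not the official TyDi QA evaluation script.

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the answers match exactly after stripping whitespace, else 0.0."""
    return float(prediction.strip() == reference.strip())

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision/recall over overlapping tokens."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("1952", "1952"))                     # 1.0
print(round(token_f1("in the year 1952", "1952"), 2))  # 0.4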
+
+# Speedup Results
+
+Below are our speedup results on the TyDi QA dataset, where all models are fine-tuned for 13 epochs with a learning rate of 2e-4 and a batch size of 3 per device on a TPUv3-8 (batch = 3 x 8 -> 24).
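The exact fine-tuning script is not part of this diff; the sketch below only shows how the stated hyperparameters (13 epochs, learning rate 2e-4, batch size 3 per device) would map onto HuggingFace Transformers. The checkpoint ID and dataset wiring are placeholders.

```python
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Placeholder checkpoint ID: substitute the model you want to benchmark.
model_id = "sultan/ArabicT5-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

args = Seq2SeqTrainingArguments(
    output_dir="tydiqa-finetune",
    num_train_epochs=13,            # fixed across all models in this comparison
    learning_rate=2e-4,
    per_device_train_batch_size=3,  # 3 per device x 8 TPU cores -> effective batch of 24
    predict_with_generate=True,
)

# Dataset preprocessing (question/context -> answer text pairs) is omitted here;
# plug your tokenized TyDi QA split into the trainer:
# trainer = Seq2SeqTrainer(model=model, args=args,
#                          train_dataset=tokenized_train, tokenizer=tokenizer)
# trainer.train()
```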
+
+Please note that these results were obtained with fixed hyperparameters for all models. For the best results after a grid search, refer to the table above.
+
+| <center>Model | <center>Run Time (hh:mm:ss) | <center>TyDi QA (EM/F1) |
+|----------------------|---------------|---------------------|
+| AraT5-Base-MSA | <center>00:20:41 |<center>69.92/82.50|
+| AraT5-Base | <center>00:20:53 |<center>68.40/81.97|
+| AraT5-Base-Tweets | <center>00:21:17 |<center>61.67/75.96|
+| mT5-Base | <center>00:28:24 |<center>57.98/72.81|
+| ArabicT5-Base | <center>00:20:00 |<center>70.79/83.85|
+| ArabicT5-Large | <center>00:23:50 |<center>71.22/84.42|
+| ArabicT5-xLarge | <center>00:52:17 |<center>72.86/86.00|
+
+Please note that we can further speed up our ArabicT5-Base by increasing the batch size, since it can handle a larger batch size than other base-scale models due to its smaller hidden layer size (512).
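As a quick sanity check of the hidden-size claim, one could compare the model configs directly; this is an illustrative sketch, the model IDs are assumptions, and mT5-Base serves as the 768-dim reference.

```python
from transformers import AutoConfig

# Model IDs are assumptions; adjust to the checkpoints you are comparing.
for model_id in ["sultan/ArabicT5-Base", "google/mt5-base"]:
    config = AutoConfig.from_pretrained(model_id)
    # d_model is the hidden dimension of a T5-style config
    print(f"{model_id}: d_model = {config.d_model}")

# ArabicT5-Base's 512-dim hidden states take less activation memory per example
# than mT5-Base's 768, leaving headroom for a larger per-device batch size.
```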
# Paper