AMR-KELEG committed on
Commit
a025de7
1 Parent(s): f25ec6a

Update README.md

Files changed (1)
  1. README.md +35 -3
README.md CHANGED
@@ -18,8 +18,7 @@ widget:
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- A BERT-based model fine-tuned to perform single-label Arabic Dialect Identification (ADI).
-
+ A BERT-based model fine-tuned to perform single-label Arabic Dialect Identification (ADI). The model was used in the following paper: [Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification](https://aclanthology.org/2023.arabicnlp-1.31/)
  ### Model Description
 
  <!-- Provide a longer summary of what this model is. -->
@@ -30,4 +29,37 @@ A BERT-based model fine-tuned to perform single-label Arabic Dialect Identificat
  <!--- **License:** [More Information Needed] -->
  - **Finetuned from model :** [MarBERT](https://huggingface.co/UBC-NLP/MARBERT)
 
- More information coming soon!
+
+ ### Citation
+
+ If you find the model useful, please cite the following [respective paper](https://aclanthology.org/2023.arabicnlp-1.31/):
+ ```
+ @inproceedings{keleg-magdy-2023-arabic,
+ title = "{A}rabic Dialect Identification under Scrutiny: Limitations of Single-label Classification",
+ author = "Keleg, Amr and
+ Magdy, Walid",
+ editor = "Sawaf, Hassan and
+ El-Beltagy, Samhaa and
+ Zaghouani, Wajdi and
+ Magdy, Walid and
+ Abdelali, Ahmed and
+ Tomeh, Nadi and
+ Abu Farha, Ibrahim and
+ Habash, Nizar and
+ Khalifa, Salam and
+ Keleg, Amr and
+ Haddad, Hatem and
+ Zitouni, Imed and
+ Mrini, Khalil and
+ Almatham, Rawan",
+ booktitle = "Proceedings of ArabicNLP 2023",
+ month = dec,
+ year = "2023",
+ address = "Singapore (Hybrid)",
+ publisher = "Association for Computational Linguistics",
+ url = "https://aclanthology.org/2023.arabicnlp-1.31",
+ doi = "10.18653/v1/2023.arabicnlp-1.31",
+ pages = "385--398",
+ abstract = "Automatic Arabic Dialect Identification (ADI) of text has gained great popularity since it was introduced in the early 2010s. Multiple datasets were developed, and yearly shared tasks have been running since 2018. However, ADI systems are reported to fail in distinguishing between the micro-dialects of Arabic. We argue that the currently adopted framing of the ADI task as a single-label classification problem is one of the main reasons for that. We highlight the limitation of the incompleteness of the Dialect labels and demonstrate how it impacts the evaluation of ADI systems. A manual error analysis for the predictions of an ADI, performed by 7 native speakers of different Arabic dialects, revealed that $\approx$ 67{\%} of the validated errors are not true errors. Consequently, we propose framing ADI as a multi-label classification task and give recommendations for designing new ADI datasets.",
+ }
+ ```