update docs
docs/classifier_model.md CHANGED
@@ -2,7 +2,9 @@
 
 **Table of Contents**
 1. Objectives
-2.
+2. Iterations
+   - Release v0.0.2
+   - Release v0.0.1
 3. Conclusion
 
 ## 1. Objectives
@@ -35,8 +37,28 @@ Some countries don't have a lot of observations, which means that it might be ha
 ![distribution of train labels](images/train_labels.png)
 ![distribution of test labels](images/test_labels.png)
 
-## 2.
-
+## 2. Iterations
+### Release v0.0.2
+In the second release, I finetuned the language model `https://huggingface.co/moussaKam/AraBART` by attaching a classification head to it and freezing the weights of the base model (due to compute constraints):
+```
+(classification_head): MBartClassificationHead(
+  (dense): Linear(in_features=768, out_features=768, bias=True)
+  (dropout): Dropout(p=0.0, inplace=False)
+  (out_proj): Linear(in_features=768, out_features=21, bias=True)
+)
+```
+The model was trained for 35 epochs with the following optimizer:
+`optimizer = optim.Adam(model.parameters(), lr=0.0001)`
+
+It achieved its lowest loss on the validation data at the 25th epoch, which is the checkpoint that was kept.
+We can probably achieve better results by training a model with more capacity for more epochs.
+
+![training history](images/training_history_v002.png)
+
+**Accuracy achieved on the test set: 0.3466**
+
+### Release v0.0.1
+For the first release, we convert the tweets into vector embeddings using the AraBART model, extracting them from the output of its last hidden layer. We then train a multinomial logistic regression using these embeddings as features.
 
 We get the following results:
 
docs/images/training_history_v002.png ADDED
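The Release v0.0.2 section added above only shows the resulting classification head. As a rough sketch of how that setup can be reproduced with `transformers` and PyTorch, assuming the public `moussaKam/AraBART` checkpoint and the 21 output classes visible in `out_proj` (the actual training script is not part of this commit):

```python
# Sketch of the v0.0.2 setup, not the repository's actual training code.
import torch.optim as optim
from transformers import AutoModelForSequenceClassification

# AraBART is an MBart checkpoint, so this resolves to MBartForSequenceClassification,
# which carries the MBartClassificationHead shown above; num_labels=21 matches out_proj.
model = AutoModelForSequenceClassification.from_pretrained(
    "moussaKam/AraBART", num_labels=21
)

# Freeze the base encoder/decoder weights; only the classification head stays trainable.
for param in model.model.parameters():
    param.requires_grad = False

# Optimizer as quoted in the doc; frozen parameters never receive gradients,
# so Adam effectively updates only the classification head.
optimizer = optim.Adam(model.parameters(), lr=0.0001)
```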
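Likewise, a minimal sketch of the Release v0.0.1 pipeline (AraBART embeddings fed to a multinomial logistic regression). Mean pooling over the last hidden layer and the scikit-learn settings are assumptions; the doc only states that the embeddings come from the output of the last hidden layer:

```python
# Sketch of the v0.0.1 pipeline: AraBART embeddings + multinomial logistic regression.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("moussaKam/AraBART")
encoder = AutoModel.from_pretrained("moussaKam/AraBART")
encoder.eval()

@torch.no_grad()
def embed(tweets):
    """Return one fixed-size vector per tweet, mean-pooled over the last hidden layer."""
    batch = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state            # (batch, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)           # mask out padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()  # mean pooling

# Hypothetical variables: train_tweets is a list of strings, train_labels holds
# the target labels (one of the 21 classes).
# clf = LogisticRegression(multi_class="multinomial", max_iter=1000)
# clf.fit(embed(train_tweets), train_labels)
```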