---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: simple
      split: test
      args: simple
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.835142785481386
---

# longformer-simple

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the `simple` configuration of the essays_su_g dataset. It is a token classifier with four labels: Claim, Majorclaim, Premise, and O.

It achieves the following results on the evaluation set (values rounded to four decimal places):

- Loss: 0.4315
- Accuracy: 0.8351

| Label        | Precision | Recall | F1-score | Support |
|:-------------|:---------:|:------:|:--------:|:-------:|
| Claim        | 0.5944    | 0.5466 | 0.5695   | 4252    |
| Majorclaim   | 0.7268    | 0.8130 | 0.7675   | 2182    |
| O            | 0.9342    | 0.8977 | 0.9156   | 9275    |
| Premise      | 0.8607    | 0.8921 | 0.8761   | 12200   |
| Macro avg    | 0.7790    | 0.7873 | 0.7822   | 27909   |
| Weighted avg | 0.8341    | 0.8351 | 0.8340   | 27909   |
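
The aggregate figures in the results above can be recomputed from the per-class figures. The following is a minimal sketch, assuming the usual classification-report definitions (macro average as the unweighted mean over classes, weighted average as the support-weighted mean). The helper names `macro_avg` and `weighted_avg` are illustrative and not part of the training code.

```python
# Per-class evaluation metrics copied from this card (final checkpoint).
per_class = {
    "Claim": {"precision": 0.5943734015345269, "recall": 0.5465663217309501,
              "f1-score": 0.5694682675814751, "support": 4252},
    "Majorclaim": {"precision": 0.7267513314215486, "recall": 0.8130155820348305,
                   "f1-score": 0.7674670127622755, "support": 2182},
    "O": {"precision": 0.934245960502693, "recall": 0.8976819407008086,
          "f1-score": 0.9155990542695331, "support": 9275},
    "Premise": {"precision": 0.8606674047129527, "recall": 0.8921311475409837,
                "f1-score": 0.876116879980681, "support": 12200},
}


def macro_avg(metric):
    """Unweighted mean of a metric over the classes."""
    return sum(m[metric] for m in per_class.values()) / len(per_class)


def weighted_avg(metric):
    """Mean of a metric weighted by each class's support (token count)."""
    total = sum(m["support"] for m in per_class.values())
    return sum(m[metric] * m["support"] for m in per_class.values()) / total


macro_f1 = macro_avg("f1-score")
weighted_f1 = weighted_avg("f1-score")
# Support-weighted recall is the same quantity as overall token accuracy.
accuracy = weighted_avg("recall")

print(f"macro F1    = {macro_f1:.4f}")
print(f"weighted F1 = {weighted_f1:.4f}")
print(f"accuracy    = {accuracy:.4f}")
```

Under these assumptions the script reproduces the reported macro F1 (0.7822), weighted F1 (0.8340), and accuracy (0.8351). The gap between macro and weighted F1 comes mostly from the Claim class, which has the lowest F1 of the four labels.
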

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
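
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at each epoch boundary. It assumes no warmup steps (none are listed on this card) and takes the total of 205 optimizer steps from the training results (5 epochs of 41 steps). `linear_lr` is an illustrative helper, not a Transformers API.

```python
BASE_LR = 2e-5      # learning_rate from the hyperparameter list
TOTAL_STEPS = 205   # 5 epochs x 41 steps, per the training results
WARMUP_STEPS = 0    # assumption: no warmup is listed on this card


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Learning rate at the start of training and at the end of each epoch.
for epoch in range(6):
    step = epoch * 41
    print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

With these settings the rate falls by 4e-06 per epoch and reaches zero at step 205, so the last epoch trains with the smallest updates.
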

### Training results

| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:-----:|:----------:|:-:|:-------:|:--------:|:---------:|:------------:|
| No log | 1.0 | 41 | 0.5743 | {'precision': 0.5082508250825083, 'recall': 0.2535277516462841, 'f1-score': 0.33830221245881065, 'support': 4252.0} | {'precision': 0.5805350028457599, 'recall': 0.4674610449129239, 'f1-score': 0.5178979436405179, 'support': 2182.0} | {'precision': 0.8466549477820288, 'recall': 0.8828032345013477, 'f1-score': 0.8643513142615855, 'support': 9275.0} | {'precision': 0.7886490250696379, 'recall': 0.9282786885245902, 'f1-score': 0.8527861445783133, 'support': 12200.0} | 0.7743 | {'precision': 0.6810224501949838, 'recall': 0.6330176798962865, 'f1-score': 0.6433344037348069, 'support': 27909.0} | {'precision': 0.7489359214227731, 'recall': 0.7743380271596976, 'f1-score': 0.7520643421129422, 'support': 27909.0} |
| No log | 2.0 | 82 | 0.4563 | {'precision': 0.5752391997680487, 'recall': 0.4666039510818438, 'f1-score': 0.5152577587326321, 'support': 4252.0} | {'precision': 0.7043734230445753, 'recall': 0.7676443629697525, 'f1-score': 0.7346491228070176, 'support': 2182.0} | {'precision': 0.9195569478630566, 'recall': 0.8861455525606469, 'f1-score': 0.9025421402295064, 'support': 9275.0} | {'precision': 0.8371119902617163, 'recall': 0.9018852459016393, 'f1-score': 0.868292297979798, 'support': 12200.0} | 0.8198 | {'precision': 0.7590703902343492, 'recall': 0.7555697781284707, 'f1-score': 0.7551853299372385, 'support': 27909.0} | {'precision': 0.8142361553305312, 'recall': 0.8198430613780501, 'f1-score': 0.8154403512156749, 'support': 27909.0} |
| No log | 3.0 | 123 | 0.4417 | {'precision': 0.6114437791084497, 'recall': 0.43226716839134527, 'f1-score': 0.5064756131165611, 'support': 4252.0} | {'precision': 0.6908951798010712, 'recall': 0.8276810265811182, 'f1-score': 0.7531276063386154, 'support': 2182.0} | {'precision': 0.9402591445935099, 'recall': 0.8840970350404312, 'f1-score': 0.9113136252500555, 'support': 9275.0} | {'precision': 0.827903891509434, 'recall': 0.9207377049180328, 'f1-score': 0.8718565662837628, 'support': 12200.0} | 0.8269 | {'precision': 0.7676254987531161, 'recall': 0.7661957337327319, 'f1-score': 0.7606933527472487, 'support': 27909.0} | {'precision': 0.821553021377153, 'recall': 0.8268658855566304, 'f1-score': 0.8200201629172901, 'support': 27909.0} |
| No log | 4.0 | 164 | 0.4382 | {'precision': 0.5850725952813067, 'recall': 0.6065380997177798, 'f1-score': 0.5956120092378753, 'support': 4252.0} | {'precision': 0.6956022944550669, 'recall': 0.8336388634280477, 'f1-score': 0.7583906608296852, 'support': 2182.0} | {'precision': 0.9404094704334897, 'recall': 0.8864690026954178, 'f1-score': 0.9126429126429128, 'support': 9275.0} | {'precision': 0.8778720250349996, 'recall': 0.8737704918032787, 'f1-score': 0.8758164564761943, 'support': 12200.0} | 0.8341 | {'precision': 0.7747390963012157, 'recall': 0.8001041144111309, 'f1-score': 0.7856155097966668, 'support': 27909.0} | {'precision': 0.8397961025237265, 'recall': 0.8341395248844459, 'f1-score': 0.8361845450923504, 'support': 27909.0} |
| No log | 5.0 | 205 | 0.4315 | {'precision': 0.5943734015345269, 'recall': 0.5465663217309501, 'f1-score': 0.5694682675814751, 'support': 4252.0} | {'precision': 0.7267513314215486, 'recall': 0.8130155820348305, 'f1-score': 0.7674670127622755, 'support': 2182.0} | {'precision': 0.934245960502693, 'recall': 0.8976819407008086, 'f1-score': 0.9155990542695331, 'support': 9275.0} | {'precision': 0.8606674047129527, 'recall': 0.8921311475409837, 'f1-score': 0.876116879980681, 'support': 12200.0} | 0.8351 | {'precision': 0.7790095245429304, 'recall': 0.7873487480018933, 'f1-score': 0.7821628036484911, 'support': 27909.0} | {'precision': 0.8340793553924228, 'recall': 0.835142785481386, 'f1-score': 0.8340248400056594, 'support': 27909.0} |
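
Each metric cell in the training results is a Python-style dict literal, so it can be parsed with `ast.literal_eval` to compare epochs. The sketch below copies only the validation loss and "Macro avg" cells from the results; the `rows` layout and variable names are illustrative choices, not part of the original training script.

```python
import ast

# (epoch, validation loss, "Macro avg" cell) copied from the training results.
rows = [
    (1, 0.5743, "{'precision': 0.6810224501949838, 'recall': 0.6330176798962865, 'f1-score': 0.6433344037348069, 'support': 27909.0}"),
    (2, 0.4563, "{'precision': 0.7590703902343492, 'recall': 0.7555697781284707, 'f1-score': 0.7551853299372385, 'support': 27909.0}"),
    (3, 0.4417, "{'precision': 0.7676254987531161, 'recall': 0.7661957337327319, 'f1-score': 0.7606933527472487, 'support': 27909.0}"),
    (4, 0.4382, "{'precision': 0.7747390963012157, 'recall': 0.8001041144111309, 'f1-score': 0.7856155097966668, 'support': 27909.0}"),
    (5, 0.4315, "{'precision': 0.7790095245429304, 'recall': 0.7873487480018933, 'f1-score': 0.7821628036484911, 'support': 27909.0}"),
]

# literal_eval safely turns each dict-literal string into a real dict.
parsed = [(epoch, loss, ast.literal_eval(cell)) for epoch, loss, cell in rows]

best_by_loss = min(parsed, key=lambda r: r[1])[0]
best_by_macro_f1 = max(parsed, key=lambda r: r[2]["f1-score"])[0]

print("lowest validation loss: epoch", best_by_loss)
print("highest macro F1:       epoch", best_by_macro_f1)
```

On these numbers, validation loss is lowest at epoch 5, while macro F1 peaks at epoch 4 (0.7856 against 0.7822 at epoch 5), so which checkpoint looks best depends on the selection metric.
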

### Framework versions

- Transformers 4.37.2
- PyTorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
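
To check whether a local environment matches the versions listed above, one option is the standard-library `importlib.metadata` module. This is a minimal sketch: `check_versions` and `EXPECTED` are illustrative names, and the PyPI distribution names (in particular `torch` for PyTorch) are assumptions.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by PyPI distribution name (assumed).
EXPECTED = {
    "transformers": "4.37.2",
    "torch": "2.2.0",  # card lists 2.2.0+cu121; the CUDA build tag is ignored here
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            # Drop any local build tag such as "+cu121" before comparing.
            installed = version(name).split("+")[0]
        except PackageNotFoundError:
            installed = None
        report[name] = (installed, wanted, installed == wanted)
    return report


for name, (installed, wanted, ok) in check_versions().items():
    status = "ok" if ok else "differs"
    print(f"{name}: installed={installed} expected={wanted} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only reports how the environment differs from the one used for training.
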