---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
results:
- task:
name: Token Classification
type: token-classification
dataset:
name: essays_su_g
type: essays_su_g
config: simple
split: train[40%:60%]
args: simple
metrics:
- name: Accuracy
type: accuracy
value: 0.844576254146979
---
# longformer-simple
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6960
- Accuracy: 0.8446

Per-class metrics (rounded to four decimal places):

| Label        | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| Claim        | 0.6125    | 0.6133 | 0.6129   | 4557    |
| Majorclaim   | 0.8269    | 0.8105 | 0.8186   | 2269    |
| O            | 0.8947    | 0.9040 | 0.8994   | 8481    |
| Premise      | 0.8907    | 0.8877 | 0.8892   | 14534   |
| Macro avg    | 0.8062    | 0.8039 | 0.8050   | 29841   |
| Weighted avg | 0.8445    | 0.8446 | 0.8445   | 29841   |
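The macro and weighted averages follow the usual scikit-learn conventions: the macro average weights every label equally, while the weighted average weights each label by its support. A quick sanity check, using the full-precision per-class f1-scores and supports logged by the trainer:

```python
# Per-class f1-scores and token counts from the evaluation results above.
f1 = {"Claim": 0.6129385964912281, "Majorclaim": 0.818606721566882,
      "O": 0.8993548387096775, "Premise": 0.8892105172473207}
support = {"Claim": 4557, "Majorclaim": 2269, "O": 8481, "Premise": 14534}

total = sum(support.values())                               # 29841 evaluation tokens
macro_f1 = sum(f1.values()) / len(f1)                       # every label counts equally
weighted_f1 = sum(f1[k] * support[k] for k in f1) / total   # weighted by support

print(f"macro f1:    {macro_f1:.6f}")    # 0.805028
print(f"weighted f1: {weighted_f1:.6f}") # 0.844536
```

Note how the weighted f1 sits well above the macro f1: `Premise`, the easiest and by far the largest class, dominates the support-weighted average, while the harder `Claim` class drags the macro average down.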
## Model description
This model is [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) fine-tuned for token-level argument mining: each token of an essay is classified as `O`, `Claim`, `Majorclaim`, or `Premise`. Longformer's sparse attention handles sequences of up to 4096 tokens, so a whole essay can typically be tagged in a single forward pass.
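Since the model tags individual tokens with one of four labels (`O`, `Claim`, `Majorclaim`, `Premise`, per the evaluation breakdown above), downstream use usually means grouping contiguous identically-labelled tokens into argument spans. A minimal sketch of that post-processing step, assuming plain label strings rather than any particular label-to-id mapping (the helper name and the example sentence are illustrative, not part of this repository):

```python
def tokens_to_spans(tokens, labels):
    """Group contiguous identically-labelled tokens into (label, text) spans,
    skipping the 'O' (non-argumentative) label."""
    spans = []
    current_label, current_tokens = None, []
    for tok, lab in zip(tokens, labels):
        if lab == current_label:
            current_tokens.append(tok)
            continue
        if current_label not in (None, "O"):
            spans.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = lab, [tok]
    if current_label not in (None, "O"):        # flush the final span
        spans.append((current_label, " ".join(current_tokens)))
    return spans

print(tokens_to_spans(
    ["Cars", "pollute", ".", "Therefore", "we", "should", "ban", "them"],
    ["Premise", "Premise", "O", "O",
     "Majorclaim", "Majorclaim", "Majorclaim", "Majorclaim"],
))
# [('Premise', 'Cars pollute'), ('Majorclaim', 'we should ban them')]
```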
## Intended uses & limitations
More information needed
## Training and evaluation data
Training used the `simple` configuration of the essays_su_g dataset; the model-index metadata above reports accuracy on its `train[40%:60%]` split. More information needed.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
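These settings map directly onto the `transformers` `TrainingArguments` class; a minimal sketch (the `output_dir` value is an assumption, not recorded in the card):

```python
from transformers import TrainingArguments

# Configuration mirroring the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="longformer-simple",  # assumed; not recorded in the card
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=16,
)
```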
### Training results
| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.5869 | {'precision': 0.535017852238396, 'recall': 0.4274742154926487, 'f1-score': 0.47523786289338865, 'support': 4557.0} | {'precision': 0.5474121647147714, 'recall': 0.6386073159982371, 'f1-score': 0.5895036615134255, 'support': 2269.0} | {'precision': 0.845403060609712, 'recall': 0.8272609362103526, 'f1-score': 0.8362336114421931, 'support': 8481.0} | {'precision': 0.850072112232857, 'recall': 0.8921838447777625, 'f1-score': 0.8706190412246543, 'support': 14534.0} | 0.7835 | {'precision': 0.694476297448934, 'recall': 0.6963815781197502, 'f1-score': 0.6928985442684154, 'support': 29841.0} | {'precision': 0.7776202536983177, 'recall': 0.7834858081163499, 'f1-score': 0.7790930985214805, 'support': 29841.0} |
| No log | 2.0 | 82 | 0.4861 | {'precision': 0.6276500447894894, 'recall': 0.4612683783190696, 'f1-score': 0.5317480394636985, 'support': 4557.0} | {'precision': 0.6855268552685527, 'recall': 0.7368884971353019, 'f1-score': 0.7102803738317758, 'support': 2269.0} | {'precision': 0.872397977184523, 'recall': 0.8746610069567268, 'f1-score': 0.8735280263777673, 'support': 8481.0} | {'precision': 0.8556913183279743, 'recall': 0.9155084629145452, 'f1-score': 0.884589815184151, 'support': 14534.0} | 0.8210 | {'precision': 0.7603165488926349, 'recall': 0.7470815863314109, 'f1-score': 0.7500365637143481, 'support': 29841.0} | {'precision': 0.8126767385071133, 'recall': 0.820951040514728, 'f1-score': 0.8143098940939201, 'support': 29841.0} |
| No log | 3.0 | 123 | 0.4651 | {'precision': 0.5519389190275267, 'recall': 0.6028088654816766, 'f1-score': 0.5762534088525278, 'support': 4557.0} | {'precision': 0.6811542572141076, 'recall': 0.8426619656236227, 'f1-score': 0.7533490937746257, 'support': 2269.0} | {'precision': 0.9045047256658892, 'recall': 0.868883386393114, 'f1-score': 0.8863363002165023, 'support': 8481.0} | {'precision': 0.8910855499640546, 'recall': 0.8528278519333975, 'f1-score': 0.8715370552664885, 'support': 14534.0} | 0.8184 | {'precision': 0.7571708629678946, 'recall': 0.7917955173579527, 'f1-score': 0.7718689645275361, 'support': 29841.0} | {'precision': 0.8271460951435015, 'recall': 0.8184377199155525, 'f1-score': 0.8216639389194362, 'support': 29841.0} |
| No log | 4.0 | 164 | 0.4685 | {'precision': 0.5727291118753793, 'recall': 0.6212420452051789, 'f1-score': 0.5960000000000001, 'support': 4557.0} | {'precision': 0.7450166112956811, 'recall': 0.7906566769501984, 'f1-score': 0.7671584348941629, 'support': 2269.0} | {'precision': 0.8872651356993737, 'recall': 0.9020162716660771, 'f1-score': 0.8945798982634625, 'support': 8481.0} | {'precision': 0.8981107585809057, 'recall': 0.8569561029310582, 'f1-score': 0.8770509119076122, 'support': 14534.0} | 0.8287 | {'precision': 0.7757804043628349, 'recall': 0.792717774188128, 'f1-score': 0.7836973112663095, 'support': 29841.0} | {'precision': 0.8336988249364055, 'recall': 0.8287255789015113, 'f1-score': 0.8307578351802057, 'support': 29841.0} |
| No log | 5.0 | 205 | 0.4714 | {'precision': 0.6111239326102008, 'recall': 0.5810840465218345, 'f1-score': 0.5957255343082115, 'support': 4557.0} | {'precision': 0.7859340659340659, 'recall': 0.7880123402379903, 'f1-score': 0.7869718309859155, 'support': 2269.0} | {'precision': 0.8915166490175315, 'recall': 0.8934087961325315, 'f1-score': 0.8924617196702003, 'support': 8481.0} | {'precision': 0.8798018189222208, 'recall': 0.8919086280445852, 'f1-score': 0.8858138581385815, 'support': 14534.0} | 0.8370 | {'precision': 0.7920941166210047, 'recall': 0.7886034527342354, 'f1-score': 0.7902432357757272, 'support': 29841.0} | {'precision': 0.8349642603479214, 'recall': 0.8369692704668074, 'f1-score': 0.8358884354766487, 'support': 29841.0} |
| No log | 6.0 | 246 | 0.5037 | {'precision': 0.5850368809272919, 'recall': 0.6091727013385999, 'f1-score': 0.5968608901311546, 'support': 4557.0} | {'precision': 0.8594594594594595, 'recall': 0.7708241516086382, 'f1-score': 0.8127323420074349, 'support': 2269.0} | {'precision': 0.8999051233396584, 'recall': 0.8947058129937507, 'f1-score': 0.8972979364985514, 'support': 8481.0} | {'precision': 0.8796910246770114, 'recall': 0.8854410348149168, 'f1-score': 0.8825566642663649, 'support': 14534.0} | 0.8372 | {'precision': 0.8060231221008552, 'recall': 0.7900359251889764, 'f1-score': 0.7973619582258764, 'support': 29841.0} | {'precision': 0.8389012192486348, 'recall': 0.8371703361147415, 'f1-score': 0.8378086229762441, 'support': 29841.0} |
| No log | 7.0 | 287 | 0.5330 | {'precision': 0.6130337078651685, 'recall': 0.5986394557823129, 'f1-score': 0.6057510824913955, 'support': 4557.0} | {'precision': 0.8630921395106715, 'recall': 0.7307183781401498, 'f1-score': 0.7914081145584726, 'support': 2269.0} | {'precision': 0.8824737562756733, 'recall': 0.9119207640608419, 'f1-score': 0.8969556393157437, 'support': 8481.0} | {'precision': 0.880592955256358, 'recall': 0.8910141736617586, 'f1-score': 0.8857729138166894, 'support': 14534.0} | 0.8401 | {'precision': 0.8097981397269679, 'recall': 0.7830731929112658, 'f1-score': 0.7949719375455753, 'support': 29841.0} | {'precision': 0.8389379916879856, 'recall': 0.8401192989511075, 'f1-score': 0.8390140076168712, 'support': 29841.0} |
| No log | 8.0 | 328 | 0.5759 | {'precision': 0.599912453490917, 'recall': 0.6014922097871407, 'f1-score': 0.6007012930089853, 'support': 4557.0} | {'precision': 0.8645575877409788, 'recall': 0.7708241516086382, 'f1-score': 0.8150046598322461, 'support': 2269.0} | {'precision': 0.9148230088495575, 'recall': 0.8776087725504068, 'f1-score': 0.8958295721249322, 'support': 8481.0} | {'precision': 0.8694501422616291, 'recall': 0.9040869684876841, 'f1-score': 0.8864303302189092, 'support': 14534.0} | 0.8402 | {'precision': 0.8121857980857706, 'recall': 0.7885030256084674, 'f1-score': 0.7994914637962682, 'support': 29841.0} | {'precision': 0.8408124567818104, 'recall': 0.8402198317750745, 'f1-score': 0.8400372100799064, 'support': 29841.0} |
| No log | 9.0 | 369 | 0.5976 | {'precision': 0.6026747195858498, 'recall': 0.6131226684222076, 'f1-score': 0.6078538018057218, 'support': 4557.0} | {'precision': 0.8060156931124673, 'recall': 0.8148964301454386, 'f1-score': 0.8104317335086566, 'support': 2269.0} | {'precision': 0.9120731707317074, 'recall': 0.8818535550053059, 'f1-score': 0.8967088304058509, 'support': 8481.0} | {'precision': 0.8804975868397797, 'recall': 0.8912205862116417, 'f1-score': 0.8858266370319713, 'support': 14534.0} | 0.8403 | {'precision': 0.800315292567451, 'recall': 0.8002733099461484, 'f1-score': 0.8002052506880502, 'support': 29841.0} | {'precision': 0.8413820848138425, 'recall': 0.8402868536577193, 'f1-score': 0.8407376197665799, 'support': 29841.0} |
| No log | 10.0 | 410 | 0.6327 | {'precision': 0.6153846153846154, 'recall': 0.6179504059688391, 'f1-score': 0.6166648417825469, 'support': 4557.0} | {'precision': 0.7641955835962145, 'recall': 0.8541207580431909, 'f1-score': 0.8066597294484912, 'support': 2269.0} | {'precision': 0.9150908869098451, 'recall': 0.8844475887277443, 'f1-score': 0.8995083343326538, 'support': 8481.0} | {'precision': 0.8894852738783374, 'recall': 0.8893628732626944, 'f1-score': 0.8894240693593889, 'support': 14534.0} | 0.8438 | {'precision': 0.7960390899422531, 'recall': 0.8114704065006171, 'f1-score': 0.8030642437307701, 'support': 29841.0} | {'precision': 0.8453782465037248, 'recall': 0.8438390134378875, 'f1-score': 0.8443440976397001, 'support': 29841.0} |
| No log | 11.0 | 451 | 0.6347 | {'precision': 0.5944913550462404, 'recall': 0.6488918147904323, 'f1-score': 0.6205015213513796, 'support': 4557.0} | {'precision': 0.78500823723229, 'recall': 0.8400176289114147, 'f1-score': 0.8115818607621886, 'support': 2269.0} | {'precision': 0.9032919329555047, 'recall': 0.8832684824902723, 'f1-score': 0.8931679980922858, 'support': 8481.0} | {'precision': 0.8962250812950657, 'recall': 0.872299435805697, 'f1-score': 0.8841004184100418, 'support': 14534.0} | 0.8388 | {'precision': 0.7947541516322751, 'recall': 0.8111193404994541, 'f1-score': 0.802337949653974, 'support': 29841.0} | {'precision': 0.843699440707882, 'recall': 0.8388458831808585, 'f1-score': 0.8409094181783406, 'support': 29841.0} |
| No log | 12.0 | 492 | 0.6513 | {'precision': 0.6110076557003932, 'recall': 0.6480140443274084, 'f1-score': 0.6289669861554845, 'support': 4557.0} | {'precision': 0.803946803946804, 'recall': 0.8259144997796386, 'f1-score': 0.8147826086956521, 'support': 2269.0} | {'precision': 0.901905099988167, 'recall': 0.8987147742011555, 'f1-score': 0.9003071107961257, 'support': 8481.0} | {'precision': 0.8980036552790664, 'recall': 0.8789734415852484, 'f1-score': 0.8883866481223922, 'support': 14534.0} | 0.8453 | {'precision': 0.8037158037286076, 'recall': 0.8129041899733627, 'f1-score': 0.8081108384424137, 'support': 29841.0} | {'precision': 0.8481337577161484, 'recall': 0.8452799839147481, 'f1-score': 0.8465621274593268, 'support': 29841.0} |
| 0.2641 | 13.0 | 533 | 0.6643 | {'precision': 0.6193424423569599, 'recall': 0.6366030283080975, 'f1-score': 0.6278541283410888, 'support': 4557.0} | {'precision': 0.8395522388059702, 'recall': 0.7933010136624064, 'f1-score': 0.8157715839564923, 'support': 2269.0} | {'precision': 0.8938955172014363, 'recall': 0.9099162834571395, 'f1-score': 0.9018347551712048, 'support': 8481.0} | {'precision': 0.8938108484005564, 'recall': 0.8843401678822073, 'f1-score': 0.8890502870581725, 'support': 14534.0} | 0.8469 | {'precision': 0.8116502616912307, 'recall': 0.8060401233274627, 'f1-score': 0.8086276886317396, 'support': 29841.0} | {'precision': 0.8477953919677784, 'recall': 0.8468549981568982, 'f1-score': 0.8472247718762136, 'support': 29841.0} |
| 0.2641 | 14.0 | 574 | 0.6926 | {'precision': 0.5876997774630791, 'recall': 0.6374807987711214, 'f1-score': 0.6115789473684211, 'support': 4557.0} | {'precision': 0.8265213442325159, 'recall': 0.8021154693697664, 'f1-score': 0.814135540147618, 'support': 2269.0} | {'precision': 0.8866728153101222, 'recall': 0.9068506072397123, 'f1-score': 0.8966482075196736, 'support': 8481.0} | {'precision': 0.8985879332477535, 'recall': 0.8669327095087381, 'f1-score': 0.8824765373301583, 'support': 14534.0} | 0.8383 | {'precision': 0.7998704675633677, 'recall': 0.8033448962223346, 'f1-score': 0.8012098080914677, 'support': 29841.0} | {'precision': 0.8422463719188642, 'recall': 0.8383097081197011, 'f1-score': 0.8399392193721293, 'support': 29841.0} |
| 0.2641 | 15.0 | 615 | 0.6816 | {'precision': 0.594445578925873, 'recall': 0.6387974544656573, 'f1-score': 0.6158239898455681, 'support': 4557.0} | {'precision': 0.8150388936905791, 'recall': 0.8312031732040547, 'f1-score': 0.8230416757582371, 'support': 2269.0} | {'precision': 0.9040445973194164, 'recall': 0.8987147742011555, 'f1-score': 0.901371807000946, 'support': 8481.0} | {'precision': 0.8959785900415522, 'recall': 0.8753268198706481, 'f1-score': 0.885532314760032, 'support': 14534.0} | 0.8425 | {'precision': 0.8023769149943552, 'recall': 0.8110105554353789, 'f1-score': 0.8064424468411957, 'support': 29841.0} | {'precision': 0.8460697299178652, 'recall': 0.8424985757849938, 'f1-score': 0.8440954539700084, 'support': 29841.0} |
| 0.2641 | 16.0 | 656 | 0.6960 | {'precision': 0.6125356125356125, 'recall': 0.6133421110379635, 'f1-score': 0.6129385964912281, 'support': 4557.0} | {'precision': 0.8268884892086331, 'recall': 0.8104892022917585, 'f1-score': 0.818606721566882, 'support': 2269.0} | {'precision': 0.8947368421052632, 'recall': 0.9040207522697795, 'f1-score': 0.8993548387096775, 'support': 8481.0} | {'precision': 0.890714532274767, 'recall': 0.8877115728636301, 'f1-score': 0.8892105172473207, 'support': 14534.0} | 0.8446 | {'precision': 0.806218869031069, 'recall': 0.8038909096157829, 'f1-score': 0.805027668503777, 'support': 29841.0} | {'precision': 0.8445240755442303, 'recall': 0.844576254146979, 'f1-score': 0.84453583593764, 'support': 29841.0} |
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2