---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-simple
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: simple
      split: test
      args: simple
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.836576014905586
---

# longformer-simple

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:

- Loss: 0.4521
- Claim: {'precision': 0.5926012643409038, 'recall': 0.5952492944496708, 'f1-score': 0.5939223278188431, 'support': 4252.0}
- Majorclaim: {'precision': 0.746797608881298, 'recall': 0.8015582034830431, 'f1-score': 0.773209549071618, 'support': 2182.0}
- O: {'precision': 0.9330482727579611, 'recall': 0.8940161725067386, 'f1-score': 0.9131152956722828, 'support': 9275.0}
- Premise: {'precision': 0.8684019663147715, 'recall': 0.8832786885245901, 'f1-score': 0.8757771546995001, 'support': 12200.0}
- Accuracy: 0.8366
- Macro avg: {'precision': 0.7852122780737336, 'recall': 0.7935255897410106, 'f1-score': 0.789006081815561, 'support': 27909.0}
- Weighted avg: {'precision': 0.8383596573659686, 'recall': 0.836576014905586, 'f1-score': 0.837225505344309, 'support': 27909.0}

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6

### Training results

| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:-----:|:----------:|:-:|:-------:|:--------:|:---------:|:------------:|
| No log | 1.0 | 41 | 0.5884 | {'precision': 0.49868766404199477, 'recall': 0.22342427093132644, 'f1-score': 0.30859184667857725, 'support': 4252.0} | {'precision': 0.6308376575240919, 'recall': 0.3900091659028414, 'f1-score': 0.4820164259416595, 'support': 2182.0} | {'precision': 0.817888247382327, 'recall': 0.9011320754716982, 'f1-score': 0.8574946137273007, 'support': 9275.0} | {'precision': 0.7873372125242449, 'recall': 0.9316393442622951, 'f1-score': 0.8534314461630875, 'support': 12200.0} | 0.7713 | {'precision': 0.6836876953681646, 'recall': 0.6115512141420403, 'f1-score': 0.6253835831276563, 'support': 27909.0} | {'precision': 0.7412782687839407, 'recall': 0.7712565838976674, 'f1-score': 0.7427359833384354, 'support': 27909.0} |
| No log | 2.0 | 82 | 0.4638 | {'precision': 0.5763888888888888, 'recall': 0.5075258701787394, 'f1-score': 0.5397698849424711, 'support': 4252.0} | {'precision': 0.6741528762805359, 'recall': 0.7841429880843263, 'f1-score': 0.725, 'support': 2182.0} | {'precision': 0.9210763341589732, 'recall': 0.8820485175202156, 'f1-score': 0.9011400561766811, 'support': 9275.0} | {'precision': 0.8506865437426442, 'recall': 0.8886885245901639, 'f1-score': 0.8692723992784124, 'support': 12200.0} | 0.8202 | {'precision': 0.7555761607677606, 'recall': 0.7656014750933613, 'f1-score': 0.7587955850993913, 'support': 27909.0} | {'precision': 0.8184874400582042, 'recall': 0.8202371994697051, 'f1-score': 0.8183829174463697, 'support': 27909.0} |
| No log | 3.0 | 123 | 0.4497 | {'precision': 0.6111299626739056, 'recall': 0.4235653809971778, 'f1-score': 0.5003472704542298, 'support': 4252.0} | {'precision': 0.7032967032967034, 'recall': 0.8212648945921174, 'f1-score': 0.7577167019027485, 'support': 2182.0} | {'precision': 0.9438293905139261, 'recall': 0.8732075471698113, 'f1-score': 0.9071460573476703, 'support': 9275.0} | {'precision': 0.8196342080532061, 'recall': 0.9293442622950819, 'f1-score': 0.8710482848692045, 'support': 12200.0} | 0.8252 | {'precision': 0.7694725661344353, 'recall': 0.7618455212635471, 'f1-score': 0.7590645786434633, 'support': 27909.0} | {'precision': 0.8200463271041109, 'recall': 0.8251818409831954, 'f1-score': 0.8177069473942856, 'support': 27909.0} |
| No log | 4.0 | 164 | 0.4504 | {'precision': 0.5816213828142257, 'recall': 0.6192380056444027, 'f1-score': 0.5998405285340016, 'support': 4252.0} | {'precision': 0.6949866054343666, 'recall': 0.8322639780018332, 'f1-score': 0.7574556830031283, 'support': 2182.0} | {'precision': 0.9409930715935335, 'recall': 0.8785983827493261, 'f1-score': 0.908725954836911, 'support': 9275.0} | {'precision': 0.8776116937814848, 'recall': 0.8710655737704918, 'f1-score': 0.8743263811756962, 'support': 12200.0} | 0.8322 | {'precision': 0.7738031884059027, 'recall': 0.8002914850415135, 'f1-score': 0.7850871368874343, 'support': 27909.0} | {'precision': 0.8393023145203343, 'recall': 0.8321688344261707, 'f1-score': 0.8348025837219264, 'support': 27909.0} |
| No log | 5.0 | 205 | 0.4540 | {'precision': 0.5803511891531451, 'recall': 0.6140639698965192, 'f1-score': 0.5967318020797622, 'support': 4252.0} | {'precision': 0.7292703150912107, 'recall': 0.806141154903758, 'f1-score': 0.7657814540705268, 'support': 2182.0} | {'precision': 0.9338842975206612, 'recall': 0.8893800539083558, 'f1-score': 0.9110890214269937, 'support': 9275.0} | {'precision': 0.8739005343197699, 'recall': 0.8713934426229508, 'f1-score': 0.8726451877693413, 'support': 12200.0} | 0.8331 | {'precision': 0.7793515840211966, 'recall': 0.795244655332896, 'f1-score': 0.7865618663366559, 'support': 27909.0} | {'precision': 0.8378044523993523, 'recall': 0.8330646028162958, 'f1-score': 0.8350303027606281, 'support': 27909.0} |
| No log | 6.0 | 246 | 0.4521 | {'precision': 0.5926012643409038, 'recall': 0.5952492944496708, 'f1-score': 0.5939223278188431, 'support': 4252.0} | {'precision': 0.746797608881298, 'recall': 0.8015582034830431, 'f1-score': 0.773209549071618, 'support': 2182.0} | {'precision': 0.9330482727579611, 'recall': 0.8940161725067386, 'f1-score': 0.9131152956722828, 'support': 9275.0} | {'precision': 0.8684019663147715, 'recall': 0.8832786885245901, 'f1-score': 0.8757771546995001, 'support': 12200.0} | 0.8366 | {'precision': 0.7852122780737336, 'recall': 0.7935255897410106, 'f1-score': 0.789006081815561, 'support': 27909.0} | {'precision': 0.8383596573659686, 'recall': 0.836576014905586, 'f1-score': 0.837225505344309, 'support': 27909.0} |

### Framework versions

- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
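
The "Macro avg" and "Weighted avg" entries in the training results can be cross-checked against the per-class figures. The sketch below is illustrative and not part of the original training script: it assumes the usual definitions (macro = unweighted mean over classes, weighted = mean weighted by each class's support), and the helper names `macro_average` and `weighted_average` are hypothetical. The input values are the final-epoch F1 scores and supports copied from the table.

```python
# Final-epoch (epoch 6) per-class F1 and support, copied from the results table.
per_class = {
    "Claim": {"f1": 0.5939223278188431, "support": 4252},
    "Majorclaim": {"f1": 0.773209549071618, "support": 2182},
    "O": {"f1": 0.9131152956722828, "support": 9275},
    "Premise": {"f1": 0.8757771546995001, "support": 12200},
}


def macro_average(scores):
    """Unweighted mean of the per-class F1 scores (hypothetical helper)."""
    return sum(c["f1"] for c in scores.values()) / len(scores)


def weighted_average(scores):
    """Mean of the per-class F1 scores weighted by support (hypothetical helper)."""
    total = sum(c["support"] for c in scores.values())
    return sum(c["f1"] * c["support"] for c in scores.values()) / total


total_support = sum(c["support"] for c in per_class.values())
macro_f1 = macro_average(per_class)
weighted_f1 = weighted_average(per_class)

print(f"total support = {total_support}")  # 27909
print(f"macro F1      = {macro_f1:.4f}")   # 0.7890
print(f"weighted F1   = {weighted_f1:.4f}")  # 0.8372
```

Because Claim has the lowest F1 (about 0.59) and accounts for only 4252 of the 27909 tokens, the weighted average sits well above the macro average; the macro figure is the more informative one when all four labels matter equally.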