Model Card for medical-ner-koelectra

Model Summary

This model is a fine-tuned version of monologg/koelectra-base-v3-discriminator for Korean named entity recognition (NER).

We fine-tuned the model on two datasets: the Korean Bio-Medical Corpus (KBMC) and the Naver X Changwon Univ NER dataset.

Model Details

Model Description

  • Developed by: Sungjoo Byun (Grace Byun)
  • Language(s) (NLP): Korean
  • License: Apache 2.0
  • Finetuned from model: monologg/koelectra-base-v3-discriminator
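
A minimal usage sketch follows. It assumes the checkpoint is published on the Hugging Face Hub as SungJoo/medical-ner-koelectra and that it loads with the standard token-classification pipeline of the transformers library; neither is confirmed by this card. The helper names build_tagger and summarize are hypothetical and exist only for illustration.

```python
from typing import Dict, List

# Assumption: the Hub repository name for this model card.
MODEL_ID = "SungJoo/medical-ner-koelectra"


def build_tagger():
    """Create a token-classification pipeline (downloads weights on first call)."""
    # Imported lazily so the formatting helper below works without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge word pieces into whole entity spans
    )


def summarize(entities: List[Dict]) -> List[str]:
    """Format aggregated pipeline output as 'text [LABEL score]' strings."""
    return [
        f"{e['word']} [{e['entity_group']} {float(e['score']):.2f}]"
        for e in entities
    ]
```

To tag a sentence, call tagger = build_tagger() once and then summarize(tagger(text)) for each Korean input string. The labels returned should correspond to the classes listed under Class-wise Performance, though the exact label names depend on the checkpoint's configuration.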

Training Data

The model was trained on the Naver X Changwon Univ NER dataset and the Korean Bio-Medical Corpus (KBMC).

Model Performance

Overall Metrics

  • F1 Score: 0.8886
  • Loss: 0.2949
  • Precision: 0.8844
  • Recall: 0.8928
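
As a quick consistency check, the overall F1 score can be recomputed from the reported precision and recall, assuming F1 is the usual harmonic mean of the two. The function name f1_score is illustrative and not taken from the authors' evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported in this card.
overall_f1 = f1_score(0.8844, 0.8928)
print(round(overall_f1, 4))  # 0.8886
```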

Class-wise Performance

Class      Precision  Recall  F1-Score  Support
AFW        0.6676     0.6326  0.6496    362
ANM        0.7476     0.7800  0.7635    600
Body       0.9731     0.9813  0.9772    1068
CVL        0.8492     0.8579  0.8536    4977
DAT        0.9078     0.9286  0.9181    2130
Disease    0.9738     0.9872  0.9805    2109
EVT        0.7332     0.7446  0.7389    1026
FLD        0.6138     0.6170  0.6154    188
LOC        0.8721     0.8691  0.8706    1734
MAT        0.5385     0.5000  0.5185    14
NUM        0.9227     0.9305  0.9266    4660
ORG        0.8917     0.8866  0.8892    3307
PER        0.8918     0.9049  0.8983    3626
PLT        0.2941     0.2174  0.2500    23
TIM        0.8644     0.9173  0.8901    278
Treatment  0.9468     0.9852  0.9656    271

Averages

Metric     Micro Avg  Macro Avg  Weighted Avg
Precision  0.8844     0.7930     0.8841
Recall     0.8928     0.7963     0.8928
F1-Score   0.8886     0.7941     0.8884
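
The macro and weighted averages can be reproduced from the class-wise table. The sketch below assumes the conventional definitions: the macro average is the unweighted mean over classes, and the weighted average weights each class by its support. Names such as ROWS, macro_avg, and weighted_avg are illustrative. The micro average cannot be recomputed this way because it requires raw true-positive and false-positive counts, which the card does not report.

```python
# (class, precision, recall, f1, support) copied from the class-wise table.
ROWS = [
    ("AFW", 0.6676, 0.6326, 0.6496, 362),
    ("ANM", 0.7476, 0.7800, 0.7635, 600),
    ("Body", 0.9731, 0.9813, 0.9772, 1068),
    ("CVL", 0.8492, 0.8579, 0.8536, 4977),
    ("DAT", 0.9078, 0.9286, 0.9181, 2130),
    ("Disease", 0.9738, 0.9872, 0.9805, 2109),
    ("EVT", 0.7332, 0.7446, 0.7389, 1026),
    ("FLD", 0.6138, 0.6170, 0.6154, 188),
    ("LOC", 0.8721, 0.8691, 0.8706, 1734),
    ("MAT", 0.5385, 0.5000, 0.5185, 14),
    ("NUM", 0.9227, 0.9305, 0.9266, 4660),
    ("ORG", 0.8917, 0.8866, 0.8892, 3307),
    ("PER", 0.8918, 0.9049, 0.8983, 3626),
    ("PLT", 0.2941, 0.2174, 0.2500, 23),
    ("TIM", 0.8644, 0.9173, 0.8901, 278),
    ("Treatment", 0.9468, 0.9852, 0.9656, 271),
]


def macro_avg(column: int) -> float:
    """Unweighted mean of one metric column across all classes."""
    return sum(row[column] for row in ROWS) / len(ROWS)


def weighted_avg(column: int) -> float:
    """Mean of one metric column, weighting each class by its support."""
    total_support = sum(row[4] for row in ROWS)
    return sum(row[column] * row[4] for row in ROWS) / total_support


# Columns: 1 = precision, 2 = recall, 3 = F1.
for name, col in [("Precision", 1), ("Recall", 2), ("F1-Score", 3)]:
    print(f"{name}: macro={macro_avg(col):.4f} weighted={weighted_avg(col):.4f}")
```

Because the inputs are already rounded to four decimals, recomputed values may differ from the reported ones in the last digit. The gap between the macro and weighted averages comes mostly from low-support classes such as PLT and MAT, which score poorly but contribute little to the weighted figures.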

Citations

Please cite our KBMC paper:

@misc{byun2024korean,
      title={Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition}, 
      author={Sungjoo Byun and Jiseung Hong and Sumin Park and Dongjun Jang and Jean Seo and Minseok Kim and Chaeyoung Oh and Hyopil Shin},
      year={2024},
      eprint={2403.16158},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Model Card Contact

For any questions or issues, please contact byunsj@snu.ac.kr.
