
KEPTlongformer is a medical-knowledge-enhanced version of Longformer that was further pre-trained using contrastive learning.

Pre-training

We initialized this model from RoBERTa-base-PM-M3-Voc-distill, from Facebook's bio-lm.

We then continued pre-training with Hierarchical Self-Alignment Pretraining (HSAP) on the UMLS knowledge graph, covering (a) hierarchy, (b) synonymy, and (c) abbreviation. For more details, see Section 3.3 of the paper. The learning rate was 5e-5, weight decay was 0.01, and Adam epsilon was 1e-5.
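
As a rough sketch, those optimizer settings correspond to the following AdamW configuration; the checkpoint loaded here is a placeholder, since HSAP itself starts from the bio-lm checkpoint and uses the alignment objectives described in the paper:

import torch
from transformers import AutoModelForMaskedLM

# Placeholder model load; actual HSAP pre-training starts from
# RoBERTa-base-PM-M3-Voc-distill and adds the alignment objectives.
model = AutoModelForMaskedLM.from_pretrained("whaleloops/KEPTlongformer-PMM3")
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=5e-5,            # learning rate
    weight_decay=0.01,  # weight decay
    eps=1e-5,           # Adam epsilon
)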

Usage

Try the following sentence in the Fill-Mask widget on the right. The sentence masks the token "cardiac".

74F with HTN, HLD, DM2, newly diagnosed atrial fibrillation in October who was transferred to hospital for <mask> catheterization after presentation there with syncopal episode.
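
Equivalently, the same example can be run programmatically with the Transformers fill-mask pipeline; a short sketch, where top_k is just the number of candidates to print:

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="whaleloops/KEPTlongformer-PMM3")
text = (
    "74F with HTN, HLD, DM2, newly diagnosed atrial fibrillation in October "
    "who was transferred to hospital for <mask> catheterization after "
    "presentation there with syncopal episode."
)
# Print the top-5 candidate tokens for the masked position.
for pred in fill_mask(text, top_k=5):
    print(pred["token_str"], pred["score"])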

Or load the model directly with Transformers:

from transformers import AutoConfig, AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("whaleloops/KEPTlongformer-PMM3")
config = AutoConfig.from_pretrained("whaleloops/KEPTlongformer-PMM3")
model = AutoModelForMaskedLM.from_pretrained("whaleloops/KEPTlongformer-PMM3", config=config)
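
Once loaded, the masked token can be predicted directly from the model's logits; a minimal sketch using the fill-mask sentence above:

import torch

text = ("74F with HTN, HLD, DM2, newly diagnosed atrial fibrillation in October "
        "who was transferred to hospital for <mask> catheterization after "
        "presentation there with syncopal episode.")
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and print the top-5 token predictions.
mask_idx = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top5 = logits[0, mask_idx].topk(5, dim=-1).indices[0].tolist()
print(tokenizer.convert_ids_to_tokens(top5))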

See our GitHub repository for how to use this model with prompts for automatic ICD coding, which achieves the following results:

Metric       Score
rec_micro    0.5844294992252652
rec_macro    0.12471916602840005
rec_at_8     0.4138093882408751
rec_at_75    0.8581874197033126
rec_at_50    0.8109877644497351
rec_at_5     0.2923155353947738
rec_at_15    0.586890060777621
prec_micro   0.6537291416981642
prec_macro   0.1382069689951297
prec_at_8    0.7835112692763938
prec_at_75   0.20033214709371291
prec_at_50   0.2810260972716489
prec_at_5    0.8551008303677343
prec_at_15   0.6288256227758008
f1_micro     0.6171399726721254
f1_macro     0.13111711325953157
f1_at_8      0.54158310388029
f1_at_75     0.324835806140454
f1_at_50     0.4174099512237087
f1_at_5      0.4356905906241822
f1_at_15     0.6071345676658747
auc_micro    0.9653561390964384
auc_macro    0.8572490224880879
acc_micro    0.4462779749767132
acc_macro    0.09732882850157536
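
For reference, the @k metrics above follow the usual multi-label precision/recall-at-k convention used in ICD-coding evaluation; a minimal numpy sketch, which may differ in detail from the repository's evaluation code:

import numpy as np

def precision_recall_at_k(y_true, y_score, k):
    # y_true:  (n_samples, n_labels) binary ground-truth matrix
    # y_score: (n_samples, n_labels) predicted label scores
    topk = np.argsort(y_score, axis=1)[:, -k:]                    # indices of the k highest-scoring labels
    hits = np.take_along_axis(y_true, topk, axis=1).sum(axis=1)   # correct labels among the top k
    prec = (hits / k).mean()
    rec = (hits / np.maximum(y_true.sum(axis=1), 1)).mean()
    return prec, rec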