BioBERT V1.1 model update

Files changed:

- README.md +136 -0
- config.json +62 -62
- optimizer.pt +1 -1
- pytorch_model.bin +1 -1
- rng_state.pth +2 -2
- scheduler.pt +1 -1
- trainer_state.json +48 -84
- training_args.bin +1 -1
README.md
CHANGED
@@ -1,3 +1,139 @@
---
language: fr
datasets:
- FrenchMedMCQA
license: apache-2.0
model-index:
- name: qanastek/FrenchMedMCQA-BioBERT-V1.1-Wikipedia-BM25
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: FrenchMedMCQA
      type: FrenchMedMCQA
      config: FrenchMedMCQA
      split: validation
    metrics:
    - name: Exact Match
      type: exact_match
      value: 16.72
      verified: true
    - name: Hamming Score
      type: hamming score
      value: 38.72
      verified: true
widget:
- text: "Parmi les affirmations suivantes, une seule est fausse, indiquer laquelle: les particules alpha \n (A) Sont formées de noyaux d'hélium (B) Sont peu pénétrantes (C) Toute l'énergie qu'elles transportent est cédée au long d'un parcours de quelques centimètres dans l'air (D) Sont arrêtées par une feuille de papier (E) Sont peu ionisantes\n Acception particulière"
---

# FrenchMedMCQA: Multiple-choice question answering on pharmacology exams using BioBERT V1.1, Wikipedia external knowledge and a BM25 retriever

- Corpus: [FrenchMedMCQA](https://github.com/qanastek/FrenchMedMCQA)
- Model: [BioBERT V1.1](https://huggingface.co/dmis-lab/biobert-v1.1)
- Number of Epochs: 10

**People Involved**

* [Yanis LABRAK](https://www.linkedin.com/in/yanis-labrak-8a7412145/) (1)
* [Adrien BAZOGE](https://fr.linkedin.com/in/adrien-bazoge-6b511b145) (2)
* [Richard DUFOUR](https://cv.archives-ouvertes.fr/richard-dufour) (2)
* [Béatrice DAILLE](https://scholar.google.com/citations?user=-damXYEAAAAJ&hl=fr) (2)
* [Pierre-Antoine GOURRAUD](https://fr.linkedin.com/in/pierre-antoine-gourraud-35779b6) (3)
* [Emmanuel MORIN](https://scholar.google.fr/citations?user=tvTEtM0AAAAJ&hl=fr) (2)
* [Mickael ROUVIER](https://scholar.google.fr/citations?user=0fmu-VsAAAAJ&hl=fr) (1)

**Affiliations**

1. [LIA, NLP team](https://lia.univ-avignon.fr/), Avignon University, Avignon, France.
2. [LS2N, TALN team](https://www.ls2n.fr/equipe/taln/), Nantes University, Nantes, France.
3. [CHU Nantes](https://www.chu-nantes.fr/), Nantes University, Nantes, France.

## Demo: How to use in HuggingFace Transformers

Requires [Transformers](https://pypi.org/project/transformers/) and [Datasets](https://pypi.org/project/datasets/): `pip install transformers datasets`

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

path_model = "qanastek/FrenchMedMCQA-BioBERT-V1.1-Wikipedia-BM25"

tokenizer = AutoTokenizer.from_pretrained(path_model)
model = AutoModelForSequenceClassification.from_pretrained(path_model)

pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=False, device=0)  # GPU

dataset = load_dataset("qanastek/FrenchMedMCQA")["test"]

for e in dataset:
    prediction = pipeline(e["bert_text"], truncation=True, max_length=model.config.max_position_embeddings)
```

Output:

![Preview Output](preview.PNG)
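
The pipeline returns one of the 31 combined labels listed in the model's `id2label` mapping (for example `"a"`, `"bc"` or `"abcde"`), so a prediction can be split back into individual answer letters. A minimal sketch, reusing the `pipeline`, `model` and `dataset` objects from the demo above:

```python
# Minimal sketch (assumes the `pipeline`, `model` and `dataset` objects created above).
# Each predicted label is a string of concatenated option letters, e.g. "bc" -> options B and C.
sample = dataset[0]
prediction = pipeline(sample["bert_text"], truncation=True, max_length=model.config.max_position_embeddings)

predicted_label = prediction[0]["label"]          # e.g. "bc"
predicted_options = sorted(set(predicted_label))  # e.g. ["b", "c"]
print(predicted_label, predicted_options)
```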

## Training data

The questions and their associated candidate answer(s) were collected from real French pharmacy exams on the remede website. Questions and answers were manually created by medical experts and used during examinations. The dataset is composed of 2,025 questions with multiple answers and 1,080 with a single one, for a total of 3,105 questions. Each instance of the dataset contains an identifier, a question, five options (labelled from A to E) and the correct answer(s). The average question length is 14.17 tokens and the average answer length is 6.44 tokens. The vocabulary size is 13k words, of which 3.8k are estimated to be medical domain-specific (i.e. related to the medical field). On average, each question contains 2.49 medical domain-specific words (17% of its words), each answer contains 2 (36% of its words), and a given medical domain-specific word appears in 2 questions and 8 answers.

| # Answers | Training | Validation | Test | Total |
|:---------:|:--------:|:----------:|:----:|:-----:|
| 1         | 595      | 164        | 321  | 1,080 |
| 2         | 528      | 45         | 97   | 670   |
| 3         | 718      | 71         | 141  | 930   |
| 4         | 296      | 30         | 56   | 382   |
| 5         | 34       | 2          | 7    | 43    |
| Total     | 2,171    | 312        | 622  | 3,105 |
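
The per-split answer-count distribution above can be recomputed directly from the dataset. A minimal sketch, assuming the usual split names and that each example exposes its gold answers as a list (shown here as a hypothetical `correct_answers` column; adjust both to the actual dataset schema):

```python
# Minimal sketch: recount how many questions have 1-5 correct answers per split.
# `correct_answers` is a hypothetical column name; adjust it to the dataset's schema.
from collections import Counter
from datasets import load_dataset

dataset = load_dataset("qanastek/FrenchMedMCQA")
for split in ("train", "validation", "test"):
    counts = Counter(len(e["correct_answers"]) for e in dataset[split])
    print(split, sorted(counts.items()))  # expected to match the table above
```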

## Evaluation results

The test corpus used for this evaluation is available on [GitHub](https://github.com/qanastek/FrenchMedMCQA). "Hamming" is the Hamming score and "EMR" the exact match ratio, both in percent. The 38.72 / 16.72 pair for BioBERT V1.1 matches the Hamming Score and Exact Match values declared in this model card's metadata; bold marks the best Hamming score and the best EMR in the table.

| Architecture     | Hamming | EMR   | Hamming   | EMR   | Hamming | EMR   | Hamming | EMR   | Hamming | EMR   |
|:----------------:|:-------:|:-----:|:---------:|:-----:|:-------:|:-----:|:-------:|:-----:|:-------:|:-----:|
| BioBERT V1.1     | 36.19   | 15.43 | **38.72** | 16.72 | 33.33   | 14.14 | 35.13   | 16.23 | 34.27   | 13.98 |
| PubMedBERT       | 33.98   | 14.14 | 34.00     | 13.98 | 35.66   | 15.59 | 33.87   | 14.79 | 35.44   | 14.79 |
| CamemBERT-base   | 36.24   | 16.55 | 34.19     | 14.46 | 34.78   | 15.43 | 34.66   | 14.79 | 34.61   | 14.95 |
| XLM-RoBERTa-base | 37.92   | 17.20 | 31.26     | 11.89 | 35.84   | 16.07 | 32.47   | 14.63 | 33.00   | 14.95 |
| BART-base        | 31.93   | 15.91 | 34.98     | **18.64** | 33.80 | 17.68 | 29.65 | 12.86 | 34.65   | 18.32 |
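
For reference, a minimal sketch of the two metrics, with gold and predicted answers represented as sets of option letters. The Hamming score below follows the common Godbole & Sarawagi multi-label definition (mean intersection-over-union per question), which may differ slightly from the exact variant used in the paper:

```python
# Minimal sketch of the two evaluation metrics, assuming gold and predicted answers
# are given as sets of option letters per question (e.g. {"b", "c"}).
def exact_match_ratio(golds, preds):
    return 100.0 * sum(g == p for g, p in zip(golds, preds)) / len(golds)

def hamming_score(golds, preds):
    # Godbole & Sarawagi variant: mean |intersection| / |union| per question.
    return 100.0 * sum(len(g & p) / len(g | p) for g, p in zip(golds, preds)) / len(golds)

golds = [{"a"}, {"b", "c"}, {"a", "d", "e"}]
preds = [{"a"}, {"b"}, {"a", "b"}]
print(exact_match_ratio(golds, preds))  # 33.33...
print(hamming_score(golds, preds))      # (1 + 1/2 + 1/4) / 3 * 100 = 58.33...
```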

## BibTeX Citations

Please cite the following papers when using this model.

FrenchMedMCQA corpus and linked tools:

```latex
@unpublished{labrak:hal-03824241,
  TITLE = {{FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain}},
  AUTHOR = {Labrak, Yanis and Bazoge, Adrien and Dufour, Richard and Daille, B{\'e}atrice and Gourraud, Pierre-Antoine and Morin, Emmanuel and Rouvier, Mickael},
  URL = {https://hal.archives-ouvertes.fr/hal-03824241},
  NOTE = {working paper or preprint},
  YEAR = {2022},
  MONTH = Oct,
  PDF = {https://hal.archives-ouvertes.fr/hal-03824241/file/LOUHI_2022___QA-3.pdf},
  HAL_ID = {hal-03824241},
  HAL_VERSION = {v1},
}
```

HuggingFace's Transformers:

```latex
@misc{https://doi.org/10.48550/arxiv.1910.03771,
  doi = {10.48550/ARXIV.1910.03771},
  url = {https://arxiv.org/abs/1910.03771},
  author = {Wolf, Thomas and Debut, Lysandre and Sanh, Victor and Chaumond, Julien and Delangue, Clement and Moi, Anthony and Cistac, Pierric and Rault, Tim and Louf, Rémi and Funtowicz, Morgan and Davison, Joe and Shleifer, Sam and von Platen, Patrick and Ma, Clara and Jernite, Yacine and Plu, Julien and Xu, Canwen and Scao, Teven Le and Gugger, Sylvain and Drame, Mariama and Lhoest, Quentin and Rush, Alexander M.},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {HuggingFace's Transformers: State-of-the-art Natural Language Processing},
  publisher = {arXiv},
  year = {2019},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```

## Acknowledgment

This work was financially supported by [Zenidoc](https://zenidoc.fr/), the [DIETS](https://anr-diets.univ-avignon.fr/) project funded by the Agence Nationale de la Recherche (ANR) under contract ANR-20-CE23-0005, and the ANR [AIBy4](https://aiby4.ls2n.fr/) project (ANR-20-THIA-0011).
config.json
CHANGED
@@ -10,72 +10,72 @@
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "c",
    "1": "a",
    "2": "e",
    "3": "d",
    "4": "b",
    "5": "be",
    "6": "ae",
    "7": "bc",
    "8": "bd",
    "9": "ab",
    "10": "de",
    "11": "cd",
    "12": "ac",
    "13": "ad",
    "14": "ce",
    "15": "bce",
    "16": "abc",
    "17": "cde",
    "18": "bcd",
    "19": "ace",
    "20": "ade",
    "21": "abe",
    "22": "acd",
    "23": "bde",
    "24": "abd",
    "25": "abde",
    "26": "abcd",
    "27": "bcde",
    "28": "abce",
    "29": "acde",
    "30": "abcde"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "c": 0,
    "a": 1,
    "e": 10,
    "d": 11,
    "b": 12,
    "be": 13,
    "ae": 14,
    "bc": 15,
    "bd": 16,
    "ab": 17,
    "de": 18,
    "cd": 19,
    "ac": 2,
    "ad": 20,
    "ce": 21,
    "bce": 22,
    "abc": 23,
    "cde": 24,
    "bcd": 25,
    "ace": 26,
    "ade": 27,
    "abe": 28,
    "acd": 29,
    "bde": 3,
    "abd": 30,
    "abde": 4,
    "abcd": 5,
    "bcde": 6,
    "abce": 7,
    "acde": 8,
    "abcde": 9
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
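
The 31 classes above cover every non-empty combination of the five answer options, so a classification output can be mapped back to individual letters through `id2label`. An illustrative sketch, using a hypothetical `label_to_multihot` helper that is not part of the repository:

```python
# Illustrative sketch only: expand a combined label (a value from id2label above)
# into a 5-dimensional multi-hot vector over the options a-e.
OPTIONS = "abcde"

def label_to_multihot(label):
    # e.g. "bc" -> [0, 1, 1, 0, 0]
    return [1 if letter in label else 0 for letter in OPTIONS]

print(label_to_multihot("bc"))     # [0, 1, 1, 0, 0]
print(label_to_multihot("abcde"))  # [1, 1, 1, 1, 1]
```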
optimizer.pt
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:233cf2771eb01d2e8c73fcc050e67de43b98b97246fd7b17a452cd8b633619aa
size 866789277
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ba1c8d0fe5c7d27a9fc6cf9470f544dd35775450cd60e42b157b06e200e97e8b
size 433407405
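
These are Git LFS pointer files, so each `oid` is the SHA-256 digest of the actual binary. A downloaded copy of `pytorch_model.bin` can be checked against the value above with a short sketch like this:

```python
# Verify a downloaded LFS object: its SHA-256 digest should equal the pointer's oid.
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_of("pytorch_model.bin"))  # expected: ba1c8d0f...0e97e8b
```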
rng_state.pth
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee20a722b64f81302a306cfea71de1f48196838ef4a0215073c03138fafe0f41
size 14503
scheduler.pt
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:229af39592678ac24e5cdc1157fbb6b2c3e0d09815db0c4f615dd862058567fd
size 623
trainer_state.json
CHANGED
@@ -1,106 +1,70 @@
Removed (previous run, 136 steps per epoch): per-epoch evaluation records such as epoch 2 (step 272) with eval_accuracy 0.1603, eval_f1 0.0694, eval_loss 2.8957, and epoch 3 (step 408) with eval_accuracy 0.1538, eval_f1 0.0818, eval_loss 2.7227.

New trainer state:
{
  "best_metric": 0.18269230769230768,
  "best_model_checkpoint": "biobert-v1.1-finetuned-frenchmedmcqa/checkpoint-1629",
  "epoch": 3.0,
  "global_step": 1629,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
    {
      "epoch": 0.92,
      "learning_rate": 1.815837937384899e-05,
      "loss": 3.4112,
      "step": 500
    },
    {
      "epoch": 1.0,
      "eval_accuracy": 0.14423076923076922,
      "eval_f1": 0.04993107948113005,
      "eval_loss": 2.976611614227295,
      "eval_precision": 0.030627588098176332,
      "eval_recall": 0.14423076923076922,
      "eval_runtime": 1.4297,
      "eval_samples_per_second": 218.234,
      "eval_steps_per_second": 54.559,
      "step": 543
    },
    {
      "epoch": 1.84,
      "learning_rate": 1.6316758747697976e-05,
      "loss": 3.1756,
      "step": 1000
    },
    {
      "epoch": 2.0,
      "eval_accuracy": 0.1794871794871795,
      "eval_f1": 0.0727026201206936,
      "eval_loss": 2.687720775604248,
      "eval_precision": 0.04664191343365827,
      "eval_recall": 0.1794871794871795,
      "eval_runtime": 1.3011,
      "eval_samples_per_second": 239.802,
      "eval_steps_per_second": 59.95,
      "step": 1086
    },
    {
      "epoch": 2.76,
      "learning_rate": 1.4475138121546963e-05,
      "loss": 3.0228,
      "step": 1500
    },
    {
      "epoch": 3.0,
      "eval_accuracy": 0.18269230769230768,
      "eval_f1": 0.11557929420998879,
      "eval_loss": 2.6114799976348877,
      "eval_precision": 0.10303806777379766,
      "eval_recall": 0.18269230769230768,
      "eval_runtime": 1.0933,
      "eval_samples_per_second": 285.38,
      "eval_steps_per_second": 71.345,
      "step": 1629
    }
  ],
  "max_steps": 5430,
  "num_train_epochs": 10,
  "total_flos": 664497557825640.0,
  "trial_name": null,
  "trial_params": null
}
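
The logged learning rates are consistent with a linear decay from an assumed initial rate of 2e-5 down to zero over the 5,430 max steps; a quick sanity check under that assumption:

```python
# Quick check: a linear schedule decaying from an assumed initial LR of 2e-5
# to 0 over max_steps = 5430 reproduces the values in log_history above.
initial_lr, max_steps = 2e-5, 5430
for step in (500, 1000, 1500):
    print(step, initial_lr * (max_steps - step) / max_steps)
# ~1.81584e-05, ~1.63168e-05, ~1.44751e-05, matching the logged learning_rate values
```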
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5839cab219119544ad1ab8c8dec230e90c1e0fcc001fc00325b44b92acc9ba71
size 3119