jeniakim committed on
Commit cf46e33
1 Parent(s): d54d934

update README.md

Files changed (1)
  1. README.md +10 -13
README.md CHANGED
````diff
@@ -10,16 +10,19 @@ inference: false
 # Description
 A fine-tuned multi-class classification model that detects four different types of uncertainty cues (a.k.a hedges) on a token level.
 
-### Uncertainty types:
-- **E**pistemic: the proposition is possible, but its truth-value cannot be decided at the moment. Example: *She **may** be already asleep.*
-- **I**nvestigation: the proposition is in the process of having its truth-value determined. Example: *She **examined** the role of NF-kappaB in protein activation.*
-- **D**oxatic: the proposition expresses beliefs and hypotheses, which may be known as true or false by others. Example: *She **believes** that the Earth is flat*
-- Co**N**dition: the proposition is true or false based on the truth-value of another proposition. Example: ***If** she gets the job, she will move to Utrecht.*
+# Uncertainty types
+label | type | description | example
+---| ---| ---| ---
+E | Epistemic | The proposition is possible, but its truth-value cannot be decided at the moment. | She **may** be already asleep.
+I | Investigation | The proposition is in the process of having its truth-value determined. | She **examined** the role of NF-kappaB in protein activation.
+D | Doxatic | The proposition expresses beliefs and hypotheses, which may be known as true or false by others. | She **believes** that the Earth is flat.
+N | Condition | The proposition is true or false based on the truth-value of another proposition. | **If** she gets the job, she will move to Utrecht.
+C | *certain* | *n/a* | *n/a*
 
 # Intended uses and limitations
 - The model was fine-tuned with the [Simple Transformers](https://simpletransformers.ai/) library. This library is based on Transformers but the model cannot be used directly with Transformers `pipeline` and classes; doing so would generate incorrect outputs. For this reason, the API on this page is disabled.
 
-## How to use
+# How to use
 To generate predictions with the model, use the [Simple Transformers](https://simpletransformers.ai/) library:
 ```
 from simpletransformers.ner import NERModel
@@ -58,13 +61,7 @@ In other words, the token 'perhaps' is recognized as an **epistemic uncertainty
 # Training Data
 HEDGEhog is trained and evaluated on the [Szeged Uncertainty Corpus](https://rgai.inf.u-szeged.hu/node/160) (Szarvas et al. 2012<sup>1</sup>). The original sentence-level XML version of this dataset is available [here](https://rgai.inf.u-szeged.hu/node/160).
 
-The token-level version that was used for the training can be downloaded from [here](https://1drv.ms/u/s!AvPkt_QxBozXk7BiazucDqZkVxLo6g?e=IisuM6) in a form of pickled pandas DataFrame's. You can download either the split sets (```train.pkl``` 137MB, ```test.pkl``` 17MB, ```dev.pkl``` 17MB) or the full dataset (```szeged_fixed.pkl``` 172MB). Each row in the df contains a token, its features (these are not relevant for HEDGEhog; they were used to train the baseline CRF model, see [here](https://github.com/vanboefer/uncertainty_crf)), its sentence ID, and its label. The labels refer to different types of semantic uncertainty (Szarvas et al. 2012):
-
-- E: epistemic
-- I: investigation
-- D: doxatic
-- N: condition
-- C: the token is **not** an uncertainty cue
+The token-level version that was used for the training can be downloaded from [here](https://1drv.ms/u/s!AvPkt_QxBozXk7BiazucDqZkVxLo6g?e=IisuM6) in a form of pickled pandas DataFrame's. You can download either the split sets (```train.pkl``` 137MB, ```test.pkl``` 17MB, ```dev.pkl``` 17MB) or the full dataset (```szeged_fixed.pkl``` 172MB). Each row in the df contains a token, its features (these are not relevant for HEDGEhog; they were used to train the baseline CRF model, see [here](https://github.com/vanboefer/uncertainty_crf)), its sentence ID, and its label.
 
 # Training Procedure
 The following training parameters were used:
````
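As a reading aid for the label scheme touched by this commit (E, I, D, N for the four uncertainty types, C for tokens that are not cues), here is a minimal sketch of how token-level labels might be grouped by sentence and turned into readable cue types. The `(sentence_id, token, label)` row layout and the helper names `group_by_sentence` and `extract_cues` are assumptions made for illustration; the README does not name the DataFrame columns, and the toy rows are hand-labelled, not model output or corpus data.

```python
from collections import defaultdict

# Label inventory as listed in the README (spelling kept as in the source).
LABEL_NAMES = {
    "E": "epistemic",
    "I": "investigation",
    "D": "doxatic",
    "N": "condition",
}
CERTAIN = "C"  # the token is not an uncertainty cue


def group_by_sentence(rows):
    """Group (sentence_id, token, label) rows into per-sentence (tokens, labels) lists."""
    sentences = defaultdict(lambda: ([], []))
    for sentence_id, token, label in rows:
        tokens, labels = sentences[sentence_id]
        tokens.append(token)
        labels.append(label)
    return dict(sentences)


def extract_cues(tokens, labels):
    """Return (token, uncertainty type) pairs for every token not labelled C."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [
        (token, LABEL_NAMES[label])
        for token, label in zip(tokens, labels)
        if label != CERTAIN
    ]


# Hypothetical, hand-labelled rows built from the README's example sentences.
rows = [
    (0, "She", "C"), (0, "may", "E"), (0, "be", "C"),
    (0, "already", "C"), (0, "asleep", "C"), (0, ".", "C"),
    (1, "If", "N"), (1, "she", "C"), (1, "gets", "C"), (1, "the", "C"),
    (1, "job", "C"), (1, ",", "C"), (1, "she", "C"), (1, "will", "C"),
    (1, "move", "C"), (1, "to", "C"), (1, "Utrecht", "C"), (1, ".", "C"),
]

for sentence_id, (tokens, labels) in group_by_sentence(rows).items():
    print(sentence_id, extract_cues(tokens, labels))
# 0 [('may', 'epistemic')]
# 1 [('If', 'condition')]
```

With the real pickles, rows of this shape could presumably be produced by loading a file with `pandas.read_pickle` and selecting the sentence ID, token, and label columns, whatever they are actually called in the DataFrame.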