chenxwh committed on
Commit bc925cd
1 Parent(s): 6a2dc59

Update README.md

Files changed (1):
  1. README.md +31 -2

README.md CHANGED
@@ -9,6 +9,8 @@ Data, knowledge store and source code to reproduce the baseline experiments for
 
 ## NEWS:
 - 19.04.2024: The submission page (with eval.ai) for the shared task is live; you can participate by submitting your predictions [here](https://eval.ai/web/challenges/challenge-page/2285/overview)!
+- 15.07.2024: To facilitate human evaluation, we now ask that submission files include a `scraped_text` field; have a look in []() for more information!
+
 
 ## Dataset
 The training and dev datasets can be found under [data](https://huggingface.co/chenxwh/AVeriTeC/tree/main/data). Test data will be released at a later date. Each claim has the following structure:
@@ -119,13 +121,40 @@ Then evaluate the veracity prediction performance with (see [evaluate_veracity.p
 python -m src.prediction.evaluate_veracity
 ```
 
-The result for dev and the test set below. We recommend using 0.25 as cut-off score for evaluating the relevance of the evidence.
-
 | Model             | Split | Q only | Q + A | Veracity @ 0.2 | @ 0.25 | @ 0.3 |
 |-------------------|-------|--------|-------|----------------|--------|-------|
 | AVeriTeC-BLOOM-7b | dev   | 0.240  | 0.185 | 0.186          | 0.092  | 0.050 |
 | AVeriTeC-BLOOM-7b | test  | 0.248  | 0.185 | 0.176          | 0.109  | 0.059 |
 
+
+## Format for submission files
+
+To facilitate human evaluation, the submission file should include the text of the evidence documents used, retrieved through the `url` field. If external knowledge is utilized, please provide the scraped text. If our provided knowledge store is used, this can be achieved by running the following code block (see [veracity_with_scraped_text.py](https://huggingface.co/chenxwh/AVeriTeC/blob/main/src/prediction/veracity_with_scraped_text.py)), which adds the text to the previous prediction file. An example output for the dev set is [here](https://huggingface.co/chenxwh/AVeriTeC/blob/main/data_store/dev_veracity_prediction_for_submission.json).
+```bash
+python -m src.prediction.veracity_with_scraped_text --knowledge_store_dir <directory of the knowledge store json files>
+```
+
+Each line of the final submission file is a JSON object with the following fields:
+```json
+{
+    "claim_id": "The ID of the sample.",
+    "claim": "The claim text itself.",
+    "pred_label": "The predicted label of the claim.",
+    "evidence": [
+        {
+            "question": "The text of the generated question.",
+            "answer": "The text of the answer to the generated question.",
+            "url": "The source URL for the answer.",
+            "scraped_text": "The text scraped from the URL."
+        }
+    ]
+}
+```
+
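The submission schema above can be assembled programmatically. Below is a minimal sketch (not part of the repository) that builds one entry with these field names and writes it as one JSON object per line; the helper names `make_submission_entry` and `write_submission` are hypothetical, and how you obtain the `scraped_text` (from the knowledge store or your own scraper) is up to you.

```python
import json

def make_submission_entry(claim_id, claim, pred_label, evidence):
    """Build one submission entry.

    `evidence` is a list of (question, answer, url, scraped_text) tuples;
    the field names follow the submission schema described above.
    """
    return {
        "claim_id": claim_id,
        "claim": claim,
        "pred_label": pred_label,
        "evidence": [
            {
                "question": question,
                "answer": answer,
                "url": url,
                "scraped_text": scraped_text,
            }
            for question, answer, url, scraped_text in evidence
        ],
    }

def write_submission(path, entries):
    # One JSON object per line, as the submission format requires.
    with open(path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")

# Toy example with made-up values.
entry = make_submission_entry(
    0,
    "Example claim text.",
    "Refuted",
    [("Who said it?", "Nobody.", "https://example.com", "Scraped page text.")],
)
write_submission("submission.json", [entry])
```

Writing line-delimited JSON (rather than one large array) keeps the file streamable, which matters when the `scraped_text` fields make entries large.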
 ## Citation
 If you find AVeriTeC useful for your research and applications, please cite us using this BibTeX:
 ```bibtex