tags:
- Generative Question Answering
---

# T5 for Generative Question Answering

This model was produced by Christian Di Maio and Giacomo Nunziati for the Language Processing Technologies exam.
It is [Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) fine-tuned on [DuoRC](https://huggingface.co/datasets/duorc) for **Generative Question Answering** by simply prepending the *question* to the *context*.

## Code
The training script for T5 is available in this [repository](https://github.com/nunziati/bert-vs-t5-for-question-answering/blob/main/train_t5_selfrc.py).
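
As a rough illustration of the input format, the sketch below shows how DuoRC/SelfRC examples could be turned into `question: ... context: ...` source strings with answer targets. The field names (`question`, `plot`, `answers`) are those of the `duorc` dataset on the Hub, but the `to_seq2seq` helper is a hypothetical simplification, not the code from the linked script.

```python
from datasets import load_dataset

# Load the SelfRC configuration of DuoRC from the Hugging Face Hub.
dataset = load_dataset("duorc", "SelfRC", split="train")

def to_seq2seq(example):
    # Prepend the question to the movie plot, mirroring the model's input format.
    source = f"question: {example['question']} context: {example['plot']}"
    # DuoRC stores answers as a list; unanswerable questions have an empty list.
    target = example["answers"][0] if example["answers"] else ""
    return {"source": source, "target": target}

formatted = dataset.map(to_seq2seq)
print(formatted[0]["source"][:200])
```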

## Results
The model is evaluated on:

- DuoRC/SelfRC -> test subset
- DuoRC/ParaphraseRC -> test subset
- SQuAD v1 -> validation subset

All tokens that do not correspond to dictionary words are removed before computing the evaluation metrics (token-level F1 and exact match; a sketch of these metrics follows the table).
The reference model is BERT fine-tuned on SQuAD v1.
30
+
31
+ | Model | SelfRC | ParaphraseRC | SQUAD
32
+ |--|--|--|--|
33
+ | T5-BASE-FINETUNED | **F1**: 49.00 **EM**: 31.38 | **F1**: 28.75 **EM**: 15.18 | **F1**: 63.28 **EM**: 37.24 |
34
+ | BERT-BASE-FINETUNED | **F1**: 47.18 **EM**: 30.76 | **F1**: 21.20 **EM**: 12.62 | **F1**: 77.19 **EM**: 57.81 |
35
+
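For reference, the sketch below shows how token-level F1 and exact match are commonly computed for question answering. It is a minimal illustration of the standard SQuAD-style metrics under a simple normalization assumption; it does not reproduce the exact dictionary-word filtering used for the numbers above.

```python
import re
from collections import Counter

def normalize(text):
    # Lowercase, strip punctuation, and split into tokens.
    return re.sub(r"[^\w\s]", " ", text.lower()).split()

def exact_match(prediction, reference):
    # 1.0 if the normalized prediction equals the normalized reference.
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction, reference):
    # Token-level F1 between the predicted and reference answers.
    pred_tokens, ref_tokens = normalize(prediction), normalize(reference)
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the answer to life", "answer to life, the universe"))  # partial overlap
```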

## How to use it 🚀

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "MaRiOrOsSi/t5-base-finetuned-question-answering"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

question = "What is 42?"
context = "42 is the answer to life, the universe and everything"

# Prepend the question to the context, as done during fine-tuning.
input_text = f"question: {question} context: {context}"
encoded_input = tokenizer([input_text],
                          return_tensors="pt",
                          max_length=512,
                          truncation=True)
output = model.generate(input_ids=encoded_input.input_ids,
                        attention_mask=encoded_input.attention_mask)
answer = tokenizer.decode(output[0], skip_special_tokens=True)
print(answer)
```
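
The original import list also included `pipeline`; an equivalent way to query the model (untested here, using the standard `text2text-generation` pipeline API) would be:

```python
from transformers import pipeline

# Alternative: wrap the model in a text2text-generation pipeline.
qa = pipeline("text2text-generation",
              model="MaRiOrOsSi/t5-base-finetuned-question-answering")

question = "What is 42?"
context = "42 is the answer to life, the universe and everything"
print(qa(f"question: {question} context: {context}"))
```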

## Citation

Created by [Christian Di Maio](https://it.linkedin.com/in/christiandimaio) and [Giacomo Nunziati](https://it.linkedin.com/in/giacomo-nunziati-b19572185)

> Made with <span style="color: #e25555;">&hearts;</span> in Italy