AgaMiko committed on
Commit
6766351
·
1 Parent(s): 0710120

Update README.md

Files changed (1)
  1. README.md +8 -2
README.md CHANGED
@@ -25,9 +25,15 @@ metrics:
25
  - recall
26
 
27
  ---
 
 
28
  # Keyword Extraction from Short Texts with T5
29
 
30
- Our vlT5 model is a keyword generation model based on encoder-decoder architecture using Transformer blocks presented by Google ([https://huggingface.co/t5-base](https://huggingface.co/t5-base)). The vlT5 was trained on scientific articles corpus to predict a given set of keyphrases based on the concatenation of the article’s abstract and title. It generates precise, yet not always complete keyphrases that describe the content of the article based only on the abstract.
 
 
 
 
31
 
32
  The biggest advantage is the transferability of the vlT5 model, as it works well on all domains and types of text. The downside is that the text length and the number of keywords it handles stay close to the training data: a piece of text of abstract length yields approximately 3 to 5 keywords. It works both extractively and abstractively. Longer pieces of text must be split into smaller chunks, which are then passed to the model.
33
 
@@ -36,7 +42,7 @@ The biggest advantage is the transferability of the vlT5 model, as it works well
36
  - **Language:** pl, en (but works relatively well with others)
37
  - **Training data:** POSMAC
38
  - **Online Demo:** [https://nlp-demo-1.voicelab.ai/](https://nlp-demo-1.voicelab.ai/)
39
- - **Paper:** [TBA](TBA)
40
 
41
  # Corpus
42
 
 
25
  - recall
26
 
27
  ---
28
+ <img src="https://public.3.basecamp.com/p/rs5XqmAuF1iEuW6U7nMHcZeY/upload/download/VL-NLP-short.png" alt="logo voicelab nlp" style="width:300px;"/>
29
+
30
  # Keyword Extraction from Short Texts with T5
31
 
32
+ > Our vlT5 model is a keyword generation model based on encoder-decoder architecture using Transformer blocks presented by Google ([https://huggingface.co/t5-base](https://huggingface.co/t5-base)). The vlT5 was trained on scientific articles corpus to predict a given set of keyphrases based on the concatenation of the article’s abstract and title. It generates precise, yet not always complete keyphrases that describe the content of the article based only on the abstract.
33
+
34
+ **Keywords generated with vlT5-base-keywords:** encoder-decoder architecture, keyword generation
35
+
36
+ ## vlT5
37
 
38
  The biggest advantage is the transferability of the vlT5 model, as it works well on all domains and types of text. The downside is that the text length and the number of keywords it handles stay close to the training data: a piece of text of abstract length yields approximately 3 to 5 keywords. It works both extractively and abstractively. Longer pieces of text must be split into smaller chunks, which are then passed to the model.
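
The chunking step mentioned above could look roughly like the sketch below. It is not taken from the vlT5 code base: the helper name `split_into_chunks`, the sentence-splitting regex, and the default limit of 200 words (a rough stand-in for "abstract length") are all assumptions made for illustration.

```python
import re


def split_into_chunks(text, max_words=200):
    """Group whole sentences into chunks of roughly max_words words.

    Hypothetical helper: max_words=200 is an assumed approximation of
    abstract length, not a value documented for vlT5.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks, current, count = [], [], 0
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue
        # Close the current chunk if adding this sentence would exceed the limit.
        if current and count + len(words) > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        # A single sentence longer than max_words is kept whole in its own chunk.
        current.extend(words)
        count += len(words)

    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    sample = "One two three. Four five six. Seven eight nine."
    for chunk in split_into_chunks(sample, max_words=6):
        print(chunk)
```

Each chunk would then be passed to the model separately and the resulting keywords merged, for example by removing duplicates. Splitting on sentence boundaries rather than at a fixed character offset is a design choice here, so that no keyphrase is cut in half.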
39
 
 
42
  - **Language:** pl, en (but works relatively well with others)
43
  - **Training data:** POSMAC
44
  - **Online Demo:** [https://nlp-demo-1.voicelab.ai/](https://nlp-demo-1.voicelab.ai/)
45
+ - **Paper:** [Keyword Extraction from Short Texts with a Text-To-Text Transfer Transformer, ACIIDS 2022](TBA)
46
 
47
  # Corpus
48