NotXia committed on
Commit
f71a65f
1 Parent(s): 04f3f18

Update README.md

Files changed (1)
  1. README.md +34 -0
README.md CHANGED
@@ -1,3 +1,37 @@
  ---
  license: apache-2.0
+ datasets:
+ - allenai/mslr2022
+ language:
+ - en
+ pipeline_tag: summarization
  ---
+
+ # PubMedBERT for biomedical extractive summarization
+
+ ## Description
+ [PubMedBERT](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext) fine-tuned
+ on [MS^2](https://github.com/allenai/mslr-shared-task) for extractive summarization.\
+ The model architecture is similar to [BERTSum](https://github.com/nlpyang/BertSum).\
+ The training code is available at [biomed-ext-summ](https://github.com/NotXia/biomed-ext-summ).
+
+ ## Usage
+ ```python
+ from transformers import AutoTokenizer, pipeline
+
+ summarizer = pipeline("summarization",
+     model="NotXia/pubmedbert-bio-ext-summ",
+     tokenizer=AutoTokenizer.from_pretrained("NotXia/pubmedbert-bio-ext-summ"),
+     trust_remote_code=True,  # allow the code hosted in the model repository to run
+     device=0  # first CUDA device
+ )
+
+ # The document is passed as a list of sentences.
+ sentences = ["sent1.", "sent2.", "sent3?"]
+ summarizer({"sentences": sentences}, strategy="count", strategy_args=2)
+ # >>> (['sent1.', 'sent2.'], [0, 1])
+ ```
+
+ ### Strategies
+ Available strategies for selecting the summary sentences:
+ - `length`: summary with a maximum length (`strategy_args` is the maximum length).
+ - `count`: summary with the given number of sentences (`strategy_args` is the number of sentences).
+ - `ratio`: summary proportional to the length of the document (`strategy_args` is the ratio, in [0, 1]).
+ - `threshold`: summary containing only the sentences whose score is higher than a given value (`strategy_args` is the minimum score).
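
The four strategies listed above can be illustrated without the model. The following is a minimal sketch, not the code shipped with `NotXia/pubmedbert-bio-ext-summ`: the function `select_sentences` and the per-sentence `scores` are hypothetical. It also assumes that `length` is counted in whitespace-separated words and that `ratio` applies to the number of sentences. The README states neither, so the real pipeline may behave differently.

```python
from typing import List, Sequence, Tuple, Union


def select_sentences(
    sentences: Sequence[str],
    scores: Sequence[float],
    strategy: str,
    strategy_args: Union[int, float],
) -> Tuple[List[str], List[int]]:
    """Hypothetical helper: pick sentences by score, return them in document order."""
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")

    # Sentence indices ranked from highest to lowest score.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)

    if strategy == "count":
        chosen = ranked[: int(strategy_args)]
    elif strategy == "ratio":
        # Assumption: the ratio applies to the number of sentences.
        chosen = ranked[: int(len(sentences) * strategy_args)]
    elif strategy == "threshold":
        chosen = [i for i in ranked if scores[i] > strategy_args]
    elif strategy == "length":
        # Assumption: length is measured in whitespace-separated words.
        chosen, used = [], 0
        for i in ranked:
            n_words = len(sentences[i].split())
            if used + n_words > strategy_args:
                break
            chosen.append(i)
            used += n_words
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    chosen.sort()  # restore document order
    return [sentences[i] for i in chosen], chosen


sentences = ["sent1.", "sent2.", "sent3?"]
scores = [0.9, 0.8, 0.1]  # made-up scores, for illustration only
print(select_sentences(sentences, scores, "count", 2))
# (['sent1.', 'sent2.'], [0, 1])
```

The return value mirrors the `(sentences, indices)` tuple shown in the usage example, so the sketch can serve as a reference when comparing the effect of different `strategy_args` values.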