---
datasets:
- squad_v2
metrics:
- f1
- exact_match
---
## Distilroberta-squad2

This model is [DistilRoBERTa base](https://huggingface.co/distilroberta-base) fine-tuned for extractive question answering on [SQuAD v2](https://huggingface.co/datasets/squad_v2), an English-language dataset of context-question-answer triples built for training and benchmarking extractive question answering. Version 2 of SQuAD (Stanford Question Answering Dataset) combines the 100,000 examples from SQuAD 1.1 with 50,000 additional "unanswerable" questions, i.e. questions whose answer cannot be found in the provided context.

## Model description

This fine-tuned model prioritizes inference speed: DistilRoBERTa runs about twice as fast as RoBERTa-base, with only a marginal loss in quality.

## Intended uses & limitations

```python
from transformers import pipeline

# handle_impossible_answer=True allows an empty answer when the question is unanswerable
qa_pipeline = pipeline("question-answering", model="AdamCodd/distilroberta-squad2", handle_impossible_answer=True)
# Named qa_input to avoid shadowing Python's built-in input()
qa_input = {
  'question': "Which name is also used to describe the Amazon rainforest in English?",
  'context': '''The Amazon rainforest (Portuguese: Floresta Amazônica or Amazônia; Spanish: Selva Amazónica, Amazonía or usually Amazonia; French: Forêt amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia or the Amazon Jungle, is a moist broadleaf forest that covers most of the Amazon basin of South America. This basin encompasses 7,000,000 square kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres (2,100,000 sq mi) are covered by the rainforest. This region includes territory belonging to nine nations. The majority of the forest is contained within Brazil, with 60% of the rainforest, followed by Peru with 13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador, Bolivia, Guyana, Suriname and French Guiana. States or departments in four nations contain "Amazonas" in their names. The Amazon represents over half of the planet's remaining rainforests, and comprises the largest and most biodiverse tract of tropical rainforest in the world, with an estimated 390 billion individual trees divided into 16,000 species.'''
}
response = qa_pipeline(**qa_input)
print(response)
```
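Because the pipeline is created with `handle_impossible_answer=True`, it may return an empty `answer` string when it judges that the context does not contain the answer. The sketch below shows one possible way to handle that case. The helper `describe_answer`, its `min_score` cut-off, and the sample dictionaries are hypothetical and hand-written for illustration; only the `score`, `start`, `end`, and `answer` keys come from the pipeline's output format.

```python
def describe_answer(response, min_score=0.0):
    """Turn a single question-answering result into a readable string.

    `response` is assumed to be a dict with `answer` and `score` keys.
    `min_score` is a hypothetical cut-off, not a value from this model card.
    """
    answer = response.get("answer", "").strip()
    if not answer:
        # An empty answer is how an "unanswerable" prediction shows up.
        return "No answer found in the context."
    if response.get("score", 0.0) < min_score:
        return f"Low-confidence answer: {answer}"
    return f"Answer: {answer}"


# Hand-written sample results (NOT produced by the model).
answered = {"score": 0.9, "start": 201, "end": 208, "answer": "Amazonia"}
unanswerable = {"score": 0.8, "start": 0, "end": 0, "answer": ""}

print(describe_answer(answered))
print(describe_answer(unanswerable))
```

A stricter `min_score` trades recall for precision; a sensible value would have to be chosen on held-out data.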

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- mixed_precision: fp16
- max_seq_len: 386
- doc_stride: 128
- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 150
- num_epochs: 3
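
To make two of these settings more concrete, here is a minimal sketch of a linear schedule with warmup and of splitting a long input into overlapping windows. Several things are assumptions rather than facts from this card: `total_steps=1000` is a placeholder (the real value depends on the dataset size and batch size), `doc_stride` is read as the number of tokens shared by consecutive windows (the Hugging Face tokenizer convention), and the window arithmetic ignores the question and special tokens that also occupy part of each sequence. The function names are illustrative, not taken from the training code.

```python
def linear_schedule_with_warmup(step, base_lr=3e-05, warmup_steps=150, total_steps=1000):
    """Learning rate at `step`: linear ramp-up, then linear decay to zero.

    `total_steps` is a hypothetical placeholder, not a value from the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def sliding_windows(num_tokens, max_len=386, overlap=128):
    """Return (start, end) token ranges covering `num_tokens` tokens.

    Consecutive windows share `overlap` tokens (assumed meaning of doc_stride).
    """
    if max_len <= overlap:
        raise ValueError("max_len must be larger than overlap")
    step = max_len - overlap
    windows = []
    start = 0
    while True:
        end = min(start + max_len, num_tokens)
        windows.append((start, end))
        if end >= num_tokens:
            break
        start += step
    return windows


for s in (0, 75, 150, 1000):
    print(s, linear_schedule_with_warmup(s))
print(sliding_windows(1000))
```

The overlap means an answer that straddles a window boundary is still likely to appear whole in at least one window.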

### Training results
Evaluated on the SQuAD 2.0 dev set with the [official eval script](https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/).

Results:
```  
'exact': 72.9470226564474, 
'f1': 76.03522762032603, 
'total': 11873, 
'HasAns_exact': 72.4527665317139, 
'HasAns_f1': 78.63803264779528, 
'HasAns_total': 5928, 
'NoAns_exact': 73.43986543313709, 
'NoAns_f1': 73.43986543313709, 
'NoAns_total': 5945, 
'best_exact': 72.95544512760044, 
'best_exact_thresh': 0.0, 
'best_f1': 76.04365009147917, 
'best_f1_thresh': 0.0
```
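
For readers unfamiliar with the two metrics, the following is a simplified sketch of how exact match and token-level F1 are computed for a single prediction, written along the lines of the SQuAD 2.0 evaluation. It is not the official script: it compares against one gold answer only and omits the no-answer threshold search behind the `best_*` values. The function names are illustrative.

```python
import collections
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold, pred):
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(gold) == normalize_answer(pred))


def f1_score(gold, pred):
    """Token-level F1 between a gold answer and a prediction."""
    gold_toks = normalize_answer(gold).split()
    pred_toks = normalize_answer(pred).split()
    if not gold_toks or not pred_toks:
        # Unanswerable case: full credit only if both are empty.
        return float(gold_toks == pred_toks)
    common = collections.Counter(gold_toks) & collections.Counter(pred_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Amazonia", "the Amazonia"), f1_score("Amazonia", "the Amazonia"))
print(f1_score("the Amazon Jungle", "Amazon rainforest"))
print(f1_score("", ""), f1_score("", "Amazonia"))
```

For an unanswerable question F1 can only be 0 or 1, so it coincides with exact match, which is consistent with the identical `NoAns_exact` and `NoAns_f1` values above.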

### Framework versions

- Transformers 4.34.0
- Torch 2.0.1
- Accelerate 0.23.0
- Tokenizers 0.14.1

If you want to support me, you can do so [here](https://ko-fi.com/adamcodd).