language: en
license: mit
...
# Page Filtering

A model that identifies pages containing illustrations in children's books of the long 19th century (ca. 1789-1914).
It was trained on hand-coded data and is used to filter out non-relevant pages without illustrations.

Results on our validation dataset (all values in %):

|                | F1 score  | Precision   | Recall   | Accuracy   |
|:---------------|----------:|------------:|---------:|:-----------|
| not-relevant   |     99.63 |         100 |    99.26 | -          |
| relevant-cover |     85.71 |          75 |      100 | -          |
| relevant-page  |       100 |         100 |      100 | -          |
| Macro Avg.     |     95.11 |       91.67 |    99.75 | 99.37      |
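
The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original evaluation code: the confusion matrix is an inferred assumption, chosen because it reproduces the per-class values above when the class sizes are taken from the test column of the dataset table (271, 6 and 41 pages). Under that assumption, the only errors are two not-relevant pages predicted as relevant-cover.

```python
# Hypothetical confusion matrix inferred from the reported metrics
# (an assumption, not published by the authors).
# Rows are true classes, columns are predicted classes.
labels = ["not-relevant", "relevant-cover", "relevant-page"]
confusion = [
    [269, 2, 0],   # 2 not-relevant pages predicted as relevant-cover
    [0, 6, 0],
    [0, 0, 41],
]


def per_class_metrics(matrix):
    """Return a (precision, recall, f1) tuple for every class."""
    n = len(matrix)
    results = []
    for i in range(n):
        true_positives = matrix[i][i]
        predicted = sum(matrix[row][i] for row in range(n))  # column sum
        actual = sum(matrix[i])                              # row sum
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        results.append((precision, recall, f1))
    return results


metrics = per_class_metrics(confusion)

# Macro average: unweighted mean over classes, so the small
# relevant-cover class counts as much as the large not-relevant class.
macro_precision = 100 * sum(m[0] for m in metrics) / len(metrics)
macro_recall = 100 * sum(m[1] for m in metrics) / len(metrics)
macro_f1 = 100 * sum(m[2] for m in metrics) / len(metrics)

# Accuracy: correctly classified pages over all pages.
correct = sum(confusion[i][i] for i in range(len(confusion)))
total = sum(sum(row) for row in confusion)
accuracy = 100 * correct / total

for label, (p, r, f) in zip(labels, metrics):
    print(f"{label}: precision={100 * p:.2f} recall={100 * r:.2f} f1={100 * f:.2f}")
print(f"macro: precision={macro_precision:.2f} recall={macro_recall:.2f} f1={macro_f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Because the macro average weights every class equally, the 75% precision of the six-page relevant-cover class pulls the macro precision down to 91.67 even though only two pages out of 318 are misclassified.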

Dataset:

|                | Total  | Train   | Test   |
|:---------------|-------:|--------:|-------:|
| not-relevant   |    902 |     631 |    271 |
| relevant-cover |     20 |      14 |      6 |
| relevant-page  |    136 |      95 |     41 |
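
A short sketch of how the split and the class balance can be read off the dataset table. The counts are copied from the table; the interpretation that the data were split roughly 70/30 within each class is an inference from these numbers and is not stated by the authors.

```python
# Counts copied from the dataset table: (all data, train, test).
counts = {
    "not-relevant": (902, 631, 271),
    "relevant-cover": (20, 14, 6),
    "relevant-page": (136, 95, 41),
}

total_pages = sum(data for data, _, _ in counts.values())

summary = {}
for label, (data, train, test) in counts.items():
    # Sanity check: train and test together should cover the whole class.
    assert train + test == data, f"{label}: split does not add up"
    summary[label] = {
        "class_share": round(100 * data / total_pages, 1),  # % of all pages
        "test_share": round(100 * test / data, 1),          # % held out
    }

for label, row in summary.items():
    print(f"{label}: {row['class_share']}% of all pages, "
          f"{row['test_share']}% of the class in the test split")
```

The held-out share is close to 30% in every class, while the classes themselves are strongly imbalanced, which is worth keeping in mind when reading the per-class scores for relevant-cover.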