---
language: en
license: mit
---
...
# Page Filtering


Model to identify pages in children's books of the long 19th century (ca. 1789-1914)
 that contain illustrations. It was trained on hand-coded data and is used to 
 filter out non-relevant pages without illustrations.

Results on our validation dataset:

|                |   f1score |   precision |   recall | accuracy   |
|:---------------|----------:|------------:|---------:|:-----------|
| not-relevant   |     99.63 |      100    |    99.26 | -          |
| relevant-cover |     85.71 |       75    |   100    | -          |
| relevant-page  |    100    |      100    |   100    | -          |
| Macro Avg.     |     95.11 |       91.67 |    99.75 | 99.37      |
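
The per-class and macro-averaged figures above can be reproduced from a confusion matrix. The sketch below is illustrative only: the matrix is not published with this model, and the one used here is an assumption, chosen because it is consistent with the reported scores and with the per-class `test` counts in the dataset table (two `not-relevant` pages predicted as `relevant-cover`, everything else correct).

```python
# Assumed confusion matrix (rows: true class, columns: predicted class).
# It is NOT taken from the model's evaluation output; it is one matrix that
# matches the reported scores and the per-class test counts (271, 6, 41).
LABELS = ["not-relevant", "relevant-cover", "relevant-page"]
CONFUSION = [
    [269, 2, 0],
    [0, 6, 0],
    [0, 0, 41],
]


def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    results = []
    for i in range(n):
        true_positives = cm[i][i]
        predicted = sum(cm[row][i] for row in range(n))  # column sum
        actual = sum(cm[i])                              # row sum
        precision = true_positives / predicted if predicted else 0.0
        recall = true_positives / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


metrics = per_class_metrics(CONFUSION)

# Macro average: unweighted mean over classes, so the small
# relevant-cover class counts as much as the large not-relevant class.
macro_precision = sum(m[0] for m in metrics) / len(metrics)
macro_recall = sum(m[1] for m in metrics) / len(metrics)
macro_f1 = sum(m[2] for m in metrics) / len(metrics)

# Accuracy: correctly classified pages over all pages.
total = sum(sum(row) for row in CONFUSION)
accuracy = sum(CONFUSION[i][i] for i in range(len(CONFUSION))) / total

for label, (p, r, f) in zip(LABELS, metrics):
    print(f"{label:15s} f1={f * 100:.2f} precision={p * 100:.2f} recall={r * 100:.2f}")
print(f"{'Macro Avg.':15s} f1={macro_f1 * 100:.2f} "
      f"precision={macro_precision * 100:.2f} recall={macro_recall * 100:.2f} "
      f"accuracy={accuracy * 100:.2f}")
```

Because the macro average weights all classes equally, the 75% precision of `relevant-cover` (6 test pages) pulls macro precision down to 91.67 even though overall accuracy is 99.37.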


Dataset:

|                |   total |   train |   test |
|:---------------|--------:|--------:|-------:|
| not-relevant   |     902 |     631 |    271 |
| relevant-cover |      20 |      14 |      6 |
| relevant-page  |     136 |      95 |     41 |
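
The counts above correspond to roughly a 70/30 split within each class. How the split was actually produced is not stated here, so the following is only a minimal sketch of a per-class (stratified) split, assuming a 30% test share rounded up per class. The function `stratified_split`, its parameters, and the seed are hypothetical names introduced for illustration.

```python
import random
from collections import Counter

# Per-class totals taken from the dataset table.
CLASS_COUNTS = {"not-relevant": 902, "relevant-cover": 20, "relevant-page": 136}


def stratified_split(labels, test_percent=30, seed=0):
    """Split item indices into train/test, class by class.

    Hypothetical helper: the test size per class is ceil(n * test_percent / 100),
    which is an assumption, not the documented procedure for this model.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Integer ceiling division avoids floating-point rounding issues.
        n_test = (len(indices) * test_percent + 99) // 100
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Stand-in labels: one entry per hand-coded page.
labels = [label for label, n in CLASS_COUNTS.items() for _ in range(n)]

train_idx, test_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)

for label in CLASS_COUNTS:
    print(f"{label:15s} train={train_counts[label]:4d} test={test_counts[label]:4d}")
```

Splitting per class keeps rare classes such as `relevant-cover` represented in both parts, which matters when a class has only 20 examples.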