---
license: unknown
---
# enwiki_to_arwiki_categories Dataset
This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.
## Files
### langlinks.json
This file contains the original mappings as downloaded from the Hugging Face Hub. It contains 818,354 mappings.
### filtered_data.json
This file keeps only the mappings that contain a 4-digit year; all other mappings are filtered out. It contains 231,349 mappings.
### cats_2000.json
This file contains the mappings after replacing all 4-digit years with the year 2000. It contains 20,913 mappings.
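
The year filtering and year replacement described above can be sketched as follows. This is a minimal illustration, not the script used to build the dataset: it assumes the mappings are held as a dict from English to Arabic category titles, that a "4-digit year" is any standalone run of four digits, and that identical templates collapse into one entry after replacement (which would be one plausible reason the count drops). The names `has_year`, `normalize_years`, and `build_year_templates` and the sample titles are hypothetical.

```python
import re

# Assumption: a 4-digit year is a run of exactly four digits with no
# digit immediately before or after it.
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")


def has_year(title: str) -> bool:
    """Return True if the title contains a 4-digit year."""
    return YEAR_RE.search(title) is not None


def normalize_years(title: str, placeholder: str = "2000") -> str:
    """Replace every 4-digit year in the title with the placeholder year."""
    return YEAR_RE.sub(placeholder, title)


def build_year_templates(mappings: dict[str, str]) -> dict[str, str]:
    """Keep only pairs with a year on both sides, then normalize the years.

    Pairs that become identical after normalization collapse into one
    entry because they share the same dict key.
    """
    templates = {}
    for en, ar in mappings.items():
        if has_year(en) and has_year(ar):
            templates[normalize_years(en)] = normalize_years(ar)
    return templates


# Hypothetical sample data, not taken from the dataset files.
sample = {
    "Category:1990 births": "تصنيف:مواليد 1990",
    "Category:1991 births": "تصنيف:مواليد 1991",
    "Category:Physics": "تصنيف:فيزياء",
}
templates = build_year_templates(sample)
print(templates)
```

In this sketch the two "births" entries reduce to a single template and the entry without a year is dropped.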
### cats_2000_contry.json
This file contains the mappings after replacing all 4-digit years with the year 2000 and replacing country names with the word `country`. It contains 538 mappings.
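
The country replacement step might look like the sketch below. It is an illustration under stated assumptions rather than the actual procedure: the country name lists are short hypothetical examples, the replacement is assumed to be applied to both the English and the Arabic title, and plain substring matching is used. The helper names `make_pattern` and `generalize` are not part of the dataset.

```python
import re

# Hypothetical, deliberately tiny country lists for illustration only.
COUNTRIES_EN = ["United States", "Egypt", "France"]
COUNTRIES_AR = ["الولايات المتحدة", "مصر", "فرنسا"]


def make_pattern(names: list[str]) -> re.Pattern:
    """Build one regex matching any of the names, longest names first."""
    ordered = sorted(names, key=len, reverse=True)
    return re.compile("|".join(re.escape(name) for name in ordered))


EN_RE = make_pattern(COUNTRIES_EN)
AR_RE = make_pattern(COUNTRIES_AR)


def generalize(en: str, ar: str) -> tuple[str, str]:
    """Replace country names on both sides with the word 'country'."""
    return EN_RE.sub("country", en), AR_RE.sub("country", ar)


# Hypothetical year-normalized pairs, not taken from the dataset files.
pairs = {
    "Category:2000 in Egypt": "تصنيف:مصر في 2000",
    "Category:2000 in France": "تصنيف:فرنسا في 2000",
}

# Pairs that become identical after replacement share one dict key.
generalized = dict(generalize(en, ar) for en, ar in pairs.items())
print(len(generalized))
```

Both sample pairs reduce to the same generalized template here, which is consistent with the mapping count shrinking at this stage.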