---
license: unknown
---
# enwiki_to_arwiki_categories Dataset
This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.
## Files
### langlinks.json
This file contains the original mappings as downloaded from the database. It contains 231349 mappings.
### filtered_data.json
This file contains only the mappings that include a 4-digit year; mappings without one are filtered out. It contains 231349 mappings.
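
The year filter could be sketched roughly as follows. This is a minimal illustration, not the script used to build the dataset. It assumes each file is a flat JSON object mapping an English category title to an Arabic one, that a "4-digit year" means a standalone run of exactly four ASCII digits, and that the check is applied to the English title. The names `YEAR_RE`, `has_year`, and `filter_with_year` are hypothetical.

```python
import re

# Assumption: a year is exactly four ASCII digits, not part of a longer number.
YEAR_RE = re.compile(r"(?<![0-9])[0-9]{4}(?![0-9])")

def has_year(title):
    """Return True if the title contains a standalone 4-digit number."""
    return YEAR_RE.search(title) is not None

def filter_with_year(mappings):
    """Keep only the mappings whose English title contains a 4-digit year."""
    return {en: ar for en, ar in mappings.items() if has_year(en)}

# Tiny in-memory sample standing in for the contents of langlinks.json.
sample = {
    "Category:1990 births": "تصنيف:مواليد 1990",
    "Category:Physics": "تصنيف:فيزياء",
}
filtered = filter_with_year(sample)
```

With the real data, `sample` would be replaced by the result of `json.load` on `langlinks.json`, provided the file has the flat structure assumed here.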
### cats_2000.json
This file contains the mappings after replacing all 4-digit years with the year 2000. It contains 20913 mappings.
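
The year replacement might look like the sketch below. It makes the same assumptions as before: a flat English-to-Arabic JSON object, and a year defined as a standalone run of four ASCII digits. The function name `normalize_years` is hypothetical. One plausible reason for the lower count, which the source does not state, is that mappings differing only by year collapse into a single entry once every year becomes 2000.

```python
import re

# Assumption: a year is exactly four ASCII digits, not part of a longer number.
YEAR_RE = re.compile(r"(?<![0-9])[0-9]{4}(?![0-9])")

def normalize_years(mappings):
    """Replace every 4-digit year with 2000 on both sides of each mapping.

    Mappings that differ only by year end up with the same key,
    so later entries overwrite earlier ones.
    """
    normalized = {}
    for en, ar in mappings.items():
        normalized[YEAR_RE.sub("2000", en)] = YEAR_RE.sub("2000", ar)
    return normalized

# Tiny in-memory sample standing in for the contents of filtered_data.json.
sample = {
    "Category:1990 births": "تصنيف:مواليد 1990",
    "Category:1991 births": "تصنيف:مواليد 1991",
    "Category:1990 in Egypt": "تصنيف:مصر في 1990",
}
normalized = normalize_years(sample)
```

In this sample the two "births" categories merge into one entry, so three input mappings yield two normalized ones.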