Commit ccbe58c, committed by Ibrahemqasim
1 Parent(s): c21d4d9

Update README.md

README.md CHANGED
@@ -1,3 +1,24 @@
---
license: unknown
---
# enwiki_to_arwiki_categories Dataset

This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.

## Files

### langlinks.json

This file contains the original mappings as downloaded from the Hugging Face Hub. It contains 231349 mappings.

### filtered_data.json

This file keeps only the mappings that contain a 4-digit year. It contains 231349 mappings.
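
As a rough illustration of this filtering step, the sketch below keeps only the pairs whose English category name contains a standalone 4-digit number. The regular expression, the function name `filter_year_mappings`, and the flat English-to-Arabic dictionary layout are assumptions made for the example; this is not the code that produced `filtered_data.json`.

```python
import re

# Assumption: a "4-digit year" is any standalone run of exactly four digits.
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")


def filter_year_mappings(mappings):
    """Keep only the pairs whose English category name contains a 4-digit year."""
    return {en: ar for en, ar in mappings.items() if YEAR_RE.search(en)}


# Hypothetical sample data; the real file layout may differ.
sample = {
    "Category:1990 births": "تصنيف:مواليد 1990",
    "Category:Living people": "تصنيف:أشخاص على قيد الحياة",
}
filtered = filter_year_mappings(sample)
```

Here `filtered` holds only the `Category:1990 births` entry. Checking the Arabic side as well, or restricting the pattern to a plausible year range, would be equally reasonable choices.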

### cats_2000.json

This file contains the mappings after replacing all 4-digit years with the year 2000. It contains 20913 mappings.
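
The year replacement can be sketched as follows. The helper `normalize_years`, its `placeholder` parameter, and the regular expression are hypothetical, and the data is again assumed to be a flat English-to-Arabic dictionary. One side effect worth noting: pairs that differ only by year collapse into a single entry, which would be consistent with this file being much smaller than the others.

```python
import re

# Assumption: a "4-digit year" is any standalone run of exactly four digits.
YEAR_RE = re.compile(r"(?<!\d)\d{4}(?!\d)")


def normalize_years(mappings, placeholder="2000"):
    """Replace every 4-digit year in both names with the placeholder year."""
    normalized = {}
    for en, ar in mappings.items():
        # Keys that become identical after replacement overwrite each other.
        normalized[YEAR_RE.sub(placeholder, en)] = YEAR_RE.sub(placeholder, ar)
    return normalized


# Hypothetical sample data; the real file layout may differ.
sample = {
    "Category:1990 births": "تصنيف:مواليد 1990",
    "Category:1991 births": "تصنيف:مواليد 1991",
}
templates = normalize_years(sample)
```

Both sample entries map to the same `Category:2000 births` key, so `templates` ends up with one entry.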

## Usage

To use this dataset, download it from the Hugging Face Hub and load the JSON files in your code. For example, the filtered data in `filtered_data.json` can be loaded into a Python dictionary.
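
A minimal loading sketch, assuming `filtered_data.json` has already been downloaded to the working directory and that its top level is a JSON object. The README does not show the file layout or the Hub repository id, so the download step is left out and the `load_mappings` helper is only illustrative.

```python
import json
from pathlib import Path


def load_mappings(path):
    """Load one of the dataset's JSON files into a Python dictionary."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        # Assumption: the file's top level is a JSON object.
        raise TypeError(f"expected a JSON object, got {type(data).__name__}")
    return data


# Example (assumes the file sits next to this script):
# mappings = load_mappings("filtered_data.json")
# print(len(mappings))
```

Opening the file with an explicit `utf-8` encoding matters here because the Arabic category names would otherwise depend on the platform's default encoding.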