Ibrahemqasim committed on
Commit
ccbe58c
1 Parent(s): c21d4d9

Update README.md

  1. README.md +21 -0
README.md CHANGED
@@ -1,3 +1,24 @@
  ---
  license: unknown
  ---
+ # enwiki_to_arwiki_categories Dataset
+
+ This dataset contains mappings between English Wikipedia categories and their corresponding Arabic Wikipedia categories.
+
+ ## Files
+
+ ### langlinks.json
+
+ This file contains the original mappings as downloaded from the Hugging Face Hub. It contains 231349 mappings.
+
+ ### filtered_data.json
+
+ This file contains the mappings after filtering out those that do not contain a 4-digit year. It contains 231349 mappings.
+
+ ### cats_2000.json
+
+ This file contains the mappings after replacing all 4-digit years with the year 2000. It contains 20913 mappings.
+
+ ## Usage
+
+ To use this dataset, download the files from the Hugging Face Hub and load them with any JSON parser. For example, `filtered_data.json` can be loaded directly into a Python dictionary.
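
The README text added in this commit stops at the Usage paragraph without showing any code, so the following is only a minimal sketch of what loading could look like. It assumes each JSON file is a single top-level object mapping an English category name to its Arabic counterpart; the README does not confirm the file layout. The names `load_mappings`, `has_year`, and `normalize_years` are hypothetical helpers, and the regular expression is one possible reading of "4-digit year" as described under Files, not necessarily the author's rule. The repository id is not stated in the README, so the download step appears only as a comment with a placeholder.

```python
import json
import re
from pathlib import Path

# Optional download step (requires the huggingface_hub package).
# The repo_id below is a placeholder, not taken from the README:
#
#   from huggingface_hub import hf_hub_download
#   path = hf_hub_download(
#       repo_id="<user>/<dataset>",
#       filename="filtered_data.json",
#       repo_type="dataset",
#   )

# Assumption: a "4-digit year" is a run of exactly four ASCII digits
# that is not part of a longer digit run.
YEAR_RE = re.compile(r"(?<![0-9])[0-9]{4}(?![0-9])")


def load_mappings(path):
    """Load a JSON mapping file into a dict (assumes a top-level JSON object)."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object in {path}, got {type(data).__name__}")
    return data


def has_year(name):
    """Return True if the category name contains a 4-digit year."""
    return YEAR_RE.search(name) is not None


def normalize_years(name):
    """Replace every 4-digit year in the category name with 2000."""
    return YEAR_RE.sub("2000", name)
```

With a local copy of the file, `mappings = load_mappings("filtered_data.json")` would return the dictionary, and `normalize_years` applied to both keys and values would mimic the transformation described for `cats_2000.json`. Collapsing many years onto one template would also explain why that file holds far fewer entries than the others.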