Update README.md
README.md
CHANGED
@@ -97,8 +97,8 @@ Masked Language Modeling objective with 15% masked token ratio.
 
 ### Preprocessing
 
-Tokenize `data["train"]["fen"]` with max-length padding to 200 tokens with default `distilbert-base-cased` tokenizer.
-
+Tokenize `data["train"]["fen"]` with max-length padding to 200 tokens with default `distilbert-base-cased` tokenizer.
+Experiments with reduced max-length in tokenization show performance gains.
 
 ### Speeds, Sizes, Times
 
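
The preprocessing step described in this change could be reproduced roughly as follows. This is a minimal sketch, not code from the repository: the helper names `tokenize_fens` and `load_default_tokenizer` are hypothetical, and `truncation=True` is an assumption, since the README only mentions padding. The `max_length` parameter is exposed because the added line reports experiments with a reduced max-length.

```python
from typing import List

MAX_LENGTH = 200  # padding length stated in the README


def tokenize_fens(tokenizer, fens: List[str], max_length: int = MAX_LENGTH):
    """Pad every FEN string to exactly `max_length` tokens.

    `tokenizer` is expected to behave like a Hugging Face tokenizer,
    i.e. be callable with `padding`, `truncation` and `max_length`.
    """
    return tokenizer(
        fens,
        padding="max_length",   # pad each sequence up to max_length
        truncation=True,        # assumption: cut longer inputs (not stated in the README)
        max_length=max_length,
    )


def load_default_tokenizer():
    """Load the default `distilbert-base-cased` tokenizer named in the README."""
    # Imported lazily so tokenize_fens can be used without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained("distilbert-base-cased")
```

Assuming `data` is the dataset referenced in the README, `tokenize_fens(load_default_tokenizer(), data["train"]["fen"])` would give the 200-token setup, and passing a smaller `max_length` would correspond to the reduced-length experiments.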