omarmomen committed on
Commit
0a98f57
1 Parent(s): 4f9be56

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -17,4 +17,6 @@ The paper titled "Increasing The Performance of Cognitively Inspired Data-Effici
 
  This model variant places the parser network after 4 attention blocks and increases the number of convolution layers in the parser network from 4 to 6.
 
- The model is pretrained on the BabyLM 10M dataset using a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
+ The model is pretrained on the BabyLM 10M dataset using a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
+
+ https://arxiv.org/abs/2310.20589