BlinkDL committed on
Commit
a91be67
1 Parent(s): f720a95

Update README.md

Files changed (1)
  1. README.md +2 -7
README.md CHANGED
@@ -20,16 +20,11 @@ RWKV-4 14B is a L40-D5120 causal language model trained on the Pile. See https:/
 
 Use https://github.com/BlinkDL/ChatRWKV to run it.
 
-n_layer = 40
-n_embd = 5120
-
 RWKV-4-Pile-14B-2023xxxx-ctx4096-testxxx.pth : Fine-tuned to ctx_len 4096.
-* ctx_len = 4096
 * Highly recommended. It's great.
 
-RWKV-4-Pile-14B-20230213-8019.pth : Trained on the Pile for 331B tokens.
-* ctx_len = 1024
-* Pile loss 1.7579
+RWKV-4-Pile-14B-20230213-8019.pth : Trained on the Pile for 331B tokens
+* Pile loss 1.7579 (ctx_len 1024)
 * LAMBADA ppl 3.81, acc 71.05%
 * PIQA acc 77.42%
 * SC2016 acc 75.57%
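
The README reports one metric as a loss (Pile loss 1.7579) and another as a perplexity (LAMBADA ppl 3.81). A minimal sketch of how the two scales relate, assuming the loss is a mean per-token cross-entropy in nats (the README does not state the unit), so that perplexity = exp(loss). The function names are hypothetical and not part of ChatRWKV.

```python
import math


def loss_to_perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


def perplexity_to_loss(ppl: float) -> float:
    """Convert a perplexity back to a mean per-token cross-entropy in nats."""
    return math.log(ppl)


if __name__ == "__main__":
    pile_loss = 1.7579   # "Pile loss" from the README (unit assumed: nats)
    lambada_ppl = 3.81   # "LAMBADA ppl" from the README

    print(f"Pile perplexity (if loss is in nats): {loss_to_perplexity(pile_loss):.3f}")
    print(f"LAMBADA loss in nats: {perplexity_to_loss(lambada_ppl):.4f}")
```

The two numbers come from different datasets and evaluation setups, so the conversion only puts them on a common scale. It does not make them directly comparable.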