UsernameJustAnother committed on
Commit
d4d67c1
1 Parent(s): 3e6f1b0

Update README.md

  1. README.md +2 -0
README.md CHANGED
@@ -29,6 +29,8 @@ New for v6:
  - Different learning rate and back to Celeste's scaling factor setup (but Celeste trained on -base, this is -instruct).
  - Now with added eval! I worked out how to get eval stats (and wandb) set up, so now I can see my failures in graphical form.
 
+ I pulled v7 because I honestly don't think it's as good as v6, and don't want folks to get the wrong idea that it's better just because the version number is higher.
+
  And of course yay Unsloth for letting this all train on a single A100 with variable (wildly variable) context length.
 
  Here's what the train/eval loss looked like (eval is orange, train is blue). I think that's not terrible, but :shrug:.