Commit 8f7378e by athirdpath
1 Parent(s): a032368
Update README.md
README.md
CHANGED
@@ -1,4 +1,4 @@
-Oh no, he's dumb too! I have a working hypothesis,
+Oh no, he's dumb too! I have a working hypothesis. Inverting and merging 20b Llama 2 models works quite well, evening out the gradients between slices. However, these 13b Mistrals seem to HATE it, I assume due to the unbalanced nature of my recipe. More study is required.
 
 ### Recipe
 merge_method: dare_ties