Work on a paper

by emozilla - opened Jun 26, 2023

Jun 26, 2023

This insight is just great -- best kind of optimization (novel but intuitively understandable in retrospect). I've done some additional work on it and was wondering if you'd want to collab on publishing a short paper. You can DM me @theemozilla on Twitter (also my Discord username). I wouldn't want to publish anything without you as an author. Lemme know!

kaiokendev

Owner Jun 26, 2023

@emozilla Hello, I do not use Twitter lol. You can email me at kaiokendev1@gmail.com

mallorbc

Jun 26, 2023

I've been looking at the work here and the associated blog post, as well as the work at https://huggingface.co/emozilla/open_llama_7b-scaled.

The idea makes sense to me, but when testing open_llama_7b-scaled, I get poor results once I increase the context window.

Do the model and method require further fine-tuning? I did not tune the OpenLLaMA model any further.

