cxk

notimetosleep

AI & ML interests

None yet

Organizations

None yet

notimetosleep's activity

liked a model 28 days ago

tencent/Tencent-Hunyuan-Large

Text Generation • Updated 9 days ago • 302 • 480

liked a model 4 months ago

THUDM/CogVideoX-2b

Text-to-Video • Updated 10 days ago • 51.7k • 304

liked 5 datasets 4 months ago

liked a model 4 months ago

allenai/OLMo-7B-0424

Text Generation • Updated Jul 30 • 246 • 45

liked a dataset 4 months ago

allenai/dolma

Updated Apr 17 • 1.64k • 854

liked 2 models 4 months ago

meta-llama/Llama-3.1-405B-Instruct

Text Generation • Updated Sep 25 • 216k • 537

meta-llama/Llama-3.1-8B-Instruct

Text Generation • Updated Sep 25 • 6.3M • 3.17k

reacted to mlabonne's post with 👍 5 months ago

Large models are surprisingly bad storytellers.

I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."

Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.

In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.

Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.

I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.

Do you know why smaller models outperform the frontier models here?