Temp, Will Delete Soon

by deleted - opened Jun 8

Discussion

deleted

Jun 8

This comment has been hidden

deleted changed discussion status to closed Jun 8

deleted

Jun 8

This comment has been hidden

ehartford

Cognitive Computations org Jun 8

@Phil337 Go create something.

deleted

Jun 8

@ehartford Not relevant.

ehartford

Cognitive Computations org Jun 8

Your whining complaining isn't relevant

deleted

Jun 8

@ehartford Sorry. Keep up the good work. You've created a lot of good things here.

Reading over my past comments, it's clear that I'm complaining more than testing and giving constructive feedback. But to be fair, people don't seem to like criticism, valid or not, least of all you.

And certainly not accusations of cheating, which I've made several times. But seriously, a Yi-34B with an MMLU of 85.6? Do I really have to create something before being allowed to accuse them of cheating?

https://huggingface.co/CausalLM/34b-beta

Anyway. I'm out. I'm not posting another thing to HF unless someone asks for a response. But know that every complaint I made was honest and based on countless hours of careful testing, and I would never make an accusation of cheating unless the odds were >99.9%. It's impossible to fine-tune a Yi-34B base with an MMLU of 77 into an LLM with an MMLU of 85, and you know it, yet you jumped down my throat.

Thus concludes my whining complaint. Take care.

HiroseKoichi

Jun 9

Forgive me if I'm wrong, but wasn't your original comment telling him not to thank Elon Musk and how you've been seeing more conspiracy theories on Twitter? I don't quite think that's constructive feedback on models...

deleted

Jun 9

@HiroseKoichi Yes, you're right. But we exchanged hostile words for months, which included him repeatedly saying the exact phrase "Go create something". Although the sentiment is far too elitist and stupid for me to take seriously (no voice unless you create), it's still time for me to leave.

I'm growing progressively more frustrated trying to test modern models like Phi, Yi, and Qwen. They're discarding the bulk of popular knowledge to boost their MMLU scores at the same parameter count and training resources, so when I try to test them they hallucinate so frequently and badly that I'm spending hours looking up their responses (e.g. an 18th-century author starred in a popular 1990s movie). Frankly, it's cheating. Anybody, including Mistral and Meta, could have done the same. And watching them brag about beating Mistral and Meta is just too much.

nlpguy

Jun 9

It was nice seeing you test models, Phil. Don't underestimate the value your comments had for both model creators and users. Goodbye.
