Congrats!
This fine tune is a work of art. It's super smart and super obedient to the system message, way better than 2.9.1.
I think open source is getting closer and closer to closed source thanks to your great work! :)
I'd say we already beat them in a lot of use cases.
1 week with 8xH100s is crazy too, that's a lot of compute for a finetune. This certainly seems like the real deal!
How much does that cost? I wouldn't mind a WizardLM2-8x22b finetune like this.
Renting 8xH100s for a week costs roughly 5 grand USD, give or take, at average pricing.
Possibly less, I guess it depends, but the quotes I'm looking at are around there.
We have some new techniques for FFT we'll share soon - but in total this model took 3 days 22 hours to train.
Oops, I think I forgot to update the model card there.
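For a rough sense of the numbers in this thread, here is a back-of-the-envelope sketch. It assumes the "roughly 5 grand" weekly quote for 8xH100s mentioned above and the stated 3 days 22 hours of training. The per-GPU-hour rate is only what that quote implies, not an actual provider price, and the function names are just illustrative.

```python
# Back-of-the-envelope rental cost. All inputs are rough figures from the thread,
# not real provider pricing.
GPUS = 8
WEEKLY_QUOTE_USD = 5000.0   # assumption: "roughly 5 grand" for 8xH100s for one week
HOURS_PER_WEEK = 7 * 24     # 168 hours


def implied_gpu_hour_rate(weekly_quote_usd: float, gpus: int) -> float:
    """USD per GPU-hour implied by a flat weekly quote for the whole node."""
    return weekly_quote_usd / (gpus * HOURS_PER_WEEK)


def rental_cost(train_hours: float, gpus: int, rate_usd: float) -> float:
    """Total rental cost for running `gpus` GPUs for `train_hours` hours."""
    return train_hours * gpus * rate_usd


rate = implied_gpu_hour_rate(WEEKLY_QUOTE_USD, GPUS)
train_hours = 3 * 24 + 22   # 3 days 22 hours = 94 hours
cost = rental_cost(train_hours, GPUS, rate)

print(f"Implied rate: ${rate:.2f} per GPU-hour")
print(f"Estimated cost for {train_hours} h on {GPUS} GPUs: ${cost:,.2f}")
```

Under those assumptions the run comes out at a bit over half of the full-week figure, since 94 hours is about 56% of a week. Real quotes vary a lot by provider and commitment length.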
I had no idea it was so expensive. I thought maybe a few hundred bucks...
Thanks for releasing these finetunes ehartford
The H100 is probably within the top 3 most powerful GPUs in the world right now. The H200 is king IIRC, and I know AMD has something out to compete. That's why I think it's probably within the top 3 or 4.
We have a compute sponsor for most of these models, so while yes it’s very expensive - it’s not coming out of our pocket.
How smart actually?
I am wondering if it would top the newest Qwen model that just came out.
Qwen2 is not yet released.
I really enjoy Dolphin 2.9.2 Mixtral 8x22b. For now it's my favorite Dolphin that's ever been released.
But there will absolutely be a Dolphin trained on Qwen2.
Ah, I thought I saw on Reddit that it had been released, but I must have read it wrong. I tried quill, which is supposedly an early version, and it was decent.
Will it follow the system prompt as well as this finetune does?
Also, Qwen has a bad habit of often inserting Chinese into its answers. I hope Qwen 2 will fix it.
I hope this model will be hosted somewhere so I can try it.
Yes, I have also witnessed this issue. It seems to plague the Qwen models specifically, as I have tried other Chinese-made models and they do not do this.