Baki

hangingardens

AI & ML interests

None yet

Organizations

None yet

hangingardens's activity

New activity in Skylion007/openwebtext 24 days ago

whats the most anyone has trainined this dataset what was the best loss they reached

#17 opened about 2 months ago by sahil26k

liked a model 24 days ago

mistralai/Mistral-Large-Instruct-2411

Updated 23 days ago • 551k • 165

updated a model 25 days ago

hangingardens/llama-ioi

Text Generation • Updated 25 days ago • 52

liked a dataset 26 days ago

microsoft/orca-agentinstruct-1M-v1

Viewer • Updated Nov 1 • 1.05M • 12.2k • 397

updated 3 models 26 days ago

replied to louisbrulenaudet's post about 1 month ago

I have a project idea: build a reasoning model like o1. Many similar projects already exist, but mine is a bit different. First I create some data manually, then I train on it using a new formulation based on the REINFORCE algorithm (it's simple, by the way). I've included a sample from my data below. I could use help with the following:
1. Fine-tuning models like Mistral Large or Llama 3 on the data.
2. Creating more data with existing models. I already have a prompt ready for this: it asks the model to follow the format and style of the samples I wrote manually. Note that this does not work for hard questions, so I sometimes have to write those by hand. The sample below is my longest one:

{
"user": "solve this sudoku: 000005920000902803000074000096210507057006000284050690670000409830000752040020316",
"model": {
"steps": {
"step1": "lets start with some one has most rows, 3x3 block or column filled",
"step2": "okay, 9. 3x3 block has only one number left",
"step3": "1, 2, 3, 4, 5, 6, 7, 9 are there",
…………….
"step105": "and they get filled by these numbers in order: 6 and 5",
"step106": "so full grid finished at end is: 763185924415962873928374165396218547157496238284753691672531489831649752549827316, and this is the answer."
},
"validity": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
"answer": "The full grid as fully solved is: 763185924415962873928374165396218547157496238284753691672531489831649752549827316"
}
},
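Since each sample carries a puzzle, a list of steps, a per-step `validity` list, and a final answer grid, a small sanity check can catch mistakes in hand-written data before training. The sketch below is only an illustration and not part of the project described above: the function names `is_valid_solution` and `validity_is_aligned` are hypothetical, and the assumption that `validity` should hold exactly one entry per step is inferred from the sample, not stated in the post.

```python
def units(grid):
    """Yield the 27 units (9 rows, 9 columns, 9 boxes) of an 81-character grid."""
    for i in range(9):
        yield grid[9 * i : 9 * i + 9]  # row i
        yield grid[i::9]               # column i
    for br in range(0, 9, 3):
        for bc in range(0, 9, 3):
            # 3x3 box whose top-left cell is (br, bc)
            yield "".join(
                grid[9 * (br + r) + bc + c] for r in range(3) for c in range(3)
            )


def is_valid_solution(puzzle, solution):
    """Check that `solution` is a complete sudoku grid that keeps the puzzle's givens."""
    if len(puzzle) != 81 or len(solution) != 81:
        return False
    # Every non-zero cell of the puzzle must be unchanged in the solution.
    if any(p != "0" and p != s for p, s in zip(puzzle, solution)):
        return False
    # Every row, column, and box must contain the digits 1-9 exactly once.
    return all(set(u) == set("123456789") for u in units(solution))


def validity_is_aligned(sample):
    """Assumed convention: one validity flag per step in the sample."""
    model = sample["model"]
    return len(model["validity"]) == len(model["steps"])


# Puzzle and answer grid copied from the sample above.
PUZZLE = "000005920000902803000074000096210507057006000284050690670000409830000752040020316"
SOLUTION = "763185924415962873928374165396218547157496238284753691672531489831649752549827316"

if __name__ == "__main__":
    print(is_valid_solution(PUZZLE, SOLUTION))  # prints True
```

Running the same two checks over samples generated by an existing model would be one way to filter out outputs whose final grid is wrong or whose `validity` list has drifted out of step with the reasoning.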