Baki

hangingardens

hangingardens's activity

replied to louisbrulenaudet's post about 1 month ago

I have a project idea: building a reasoning model like o1. Many projects like that already exist, but mine is a bit different. First, I create some data manually, and then I train with a new formulation based on the REINFORCE algorithm (it's simple, by the way). I've included a sample from my data below. I could use help with the following:
1. Fine-tuning models like Mistral Large or Llama 3 on the data.
2. Creating more data with existing models. I already have a prompt ready for this: it asks the model to follow the format and style of the samples I created manually. Note that this does not work for hard questions, so I sometimes have to write those by hand.
This is my longest sample:

{
"user": "solve this sudoku: 000005920000902803000074000096210507057006000284050690670000409830000752040020316",
"model": {
"steps": {
"step1": "lets start with some one has most rows, 3x3 block or column filled",
"step2": "okay, 9. 3x3 block has only one number left",
"step3": "1, 2, 3, 4, 5, 6, 7, 9 are there",
…………….
"step105": "and they get filled by these numbers in order: 6 and 5",
"step106": "so full grid finished at end is: 763185924415962873928374165396218547157496238284753691672531489831649752549827316, and this is the answer."
},
"validity": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
"answer": "The full grid as fully solved is: 763185924415962873928374165396218547157496238284753691672531489831649752549827316"
}
},
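
Since some samples are written by hand and others are generated by a model, a quick automatic check before training may help catch mistakes. Below is a minimal Python sketch, not part of the original post, that assumes `validity` is meant to hold one 0/1 flag per step and that the puzzle and the answer grid always come after the last colon in the `user` and `answer` strings. The function names `is_valid_solution` and `check_sample` are hypothetical, and the demo sample is shortened to two steps for illustration.

```python
DIGITS = set("123456789")


def is_valid_solution(puzzle: str, grid: str) -> bool:
    """True if `grid` is a complete Sudoku solution that keeps the puzzle's givens."""
    if len(puzzle) != 81 or len(grid) != 81:
        return False
    # Every given (non-zero) cell of the puzzle must be unchanged in the grid.
    if any(p != "0" and p != g for p, g in zip(puzzle, grid)):
        return False
    rows = [grid[r * 9:(r + 1) * 9] for r in range(9)]
    cols = [grid[c::9] for c in range(9)]
    boxes = [
        "".join(grid[(br * 3 + r) * 9 + bc * 3 + c] for r in range(3) for c in range(3))
        for br in range(3)
        for bc in range(3)
    ]
    # Each row, column and 3x3 block must contain the digits 1-9 exactly once.
    return all(set(unit) == DIGITS for unit in rows + cols + boxes)


def check_sample(sample: dict) -> list:
    """Return a list of problems found in one sample (empty list if none)."""
    problems = []
    model = sample["model"]
    steps, validity = model["steps"], model["validity"]
    # Assumption: one validity flag per step.
    if len(steps) != len(validity):
        problems.append(f"{len(steps)} steps but {len(validity)} validity flags")
    if any(v not in (0, 1) for v in validity):
        problems.append("validity flags must be 0 or 1")
    # Assumption: the grid is whatever follows the last colon.
    puzzle = sample["user"].rsplit(":", 1)[-1].strip()
    grid = model["answer"].rsplit(":", 1)[-1].strip()
    if not is_valid_solution(puzzle, grid):
        problems.append("answer grid is not a valid solution of the puzzle")
    return problems


# Puzzle and solution copied from the sample above; steps shortened for the demo.
PUZZLE = "000005920000902803000074000096210507057006000284050690670000409830000752040020316"
SOLUTION = "763185924415962873928374165396218547157496238284753691672531489831649752549827316"

demo = {
    "user": f"solve this sudoku: {PUZZLE}",
    "model": {
        "steps": {
            "step1": "lets start with some one has most rows, 3x3 block or column filled",
            "step2": "okay, 9. 3x3 block has only one number left",
        },
        "validity": [1, 1],
        "answer": f"The full grid as fully solved is: {SOLUTION}",
    },
}

print(check_sample(demo))
```

Running `check_sample` over every sample, hand-written or generated, would flag cases where the number of validity flags drifts from the number of steps or where the final grid contradicts the puzzle. It does not judge whether the individual reasoning steps are correct.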

New activity in EleutherAI/pythia-70m 2 months ago

Prompt Template

#4 opened 3 months ago by ha1772007