I recently saw something on schemas:
They were giving the model various schemas to teach it how to choose solution pathways for complex equations or math problems which contain many steps. They gave it a schema for how to calculate volumes for objects, so given a circle it will go down a pathway with all the correct formulas for circles, circumference and diameter, giving it the components required for producing such a calculation, for each shape type! They had a high-level and a very detailed level. So basically all maths is methodologies and mental arithmetic! So the model should be able to complete this task, but it gets lost in sub-calculations etc.
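To make the schema idea concrete, here is a rough sketch in Python of what such a pathway could look like. I have not seen the actual schemas, so the names (`SHAPE_SCHEMAS`, `outline`, `solve`) and the cylinder entry are my own assumptions. The list of step names is the high-level view, and the formula attached to each step is the detailed view.

```python
import math

# Hypothetical schema: each shape maps to an ordered pathway of named steps.
# The step names are the high-level view; the formulas are the detailed view.
SHAPE_SCHEMAS = {
    "circle": [
        ("diameter", lambda v: 2 * v["radius"]),
        ("circumference", lambda v: math.pi * v["diameter"]),
        ("area", lambda v: math.pi * v["radius"] ** 2),
    ],
    # Assumed example of a volume pathway built on the same components.
    "cylinder": [
        ("base_area", lambda v: math.pi * v["radius"] ** 2),
        ("volume", lambda v: v["base_area"] * v["height"]),
    ],
}


def outline(shape):
    """High-level view: the ordered list of sub-calculations for a shape."""
    return [name for name, _ in SHAPE_SCHEMAS[shape]]


def solve(shape, **given):
    """Detailed view: walk the pathway, storing every intermediate result."""
    values = dict(given)
    for name, formula in SHAPE_SCHEMAS[shape]:
        values[name] = formula(values)
    return values


if __name__ == "__main__":
    print(outline("circle"))
    print(solve("cylinder", radius=1.0, height=2.0))
```

Because every sub-calculation is stored under its own name, nothing gets lost halfway through the pathway, which is exactly where the model tends to go wrong.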
I have seen some research tools which produce case studies, reports, essays or other specific types of written article, and this is achieved by teaching the model the schema for these projects, or by hard-coding a chain, so this can be used to produce the correct data. An essay has a format, and a dissertation is a more complex project with findings (which has subsections, sub-projects and many considerations). So if you can teach the model how to make components, such as titles and summaries, and how to create short prose, write paragraphs, transpose sentences etc., then these smaller tasks would form the back end for the model's abilities.
Then when you train the model to create stories, that model will have to receive a set of prompts and produce a set of original outputs, good and bad, and these outputs would be used as synthetic data. You would also go to all the GPTs and ask them to produce various types of project and short texts, i.e. an example chapter on ..., an example paragraph on ..., follow on from this story to the end of the chapter (given a summarized backstory), etc. Teach the model how to extract information from text, such as: outline the characters in this text, identify the motives of this character in the text, or ask the model how the protagonist should have handled A or B.
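A minimal sketch of how those small tasks could be turned into prompt records for collecting synthetic data. The template wording, the record fields and the function names are all my own assumptions, and the `response` field is left empty because it would be filled in by whichever model you ask.

```python
import json

# Hypothetical task templates, paraphrased from the task list above.
TEMPLATES = {
    "chapter": "Write an example chapter on: {input}",
    "paragraph": "Write an example paragraph on: {input}",
    "continue_story": (
        "Here is a summarized backstory. "
        "Follow on from it to the end of the chapter:\n{input}"
    ),
    "outline_characters": "Outline the characters in this text:\n{input}",
    "identify_motives": (
        "Identify the motives of the main character in this text:\n{input}"
    ),
}


def build_prompts(inputs_by_task):
    """Expand each (task, input) pair into one prompt record."""
    records = []
    for task, inputs in inputs_by_task.items():
        template = TEMPLATES[task]
        for item in inputs:
            records.append({
                "task": task,
                "prompt": template.format(input=item),
                "response": None,  # to be filled by the model being queried
            })
    return records


def to_jsonl(records):
    """Serialize records one JSON object per line, a common training format."""
    return "\n".join(json.dumps(r) for r in records)


if __name__ == "__main__":
    demo = build_prompts({
        "paragraph": ["the tides"],
        "outline_characters": ["Ann met Bob at the harbour."],
    })
    print(to_jsonl(demo))
```

Keeping the task name on every record means good and bad outputs can later be sorted per task, so each small skill gets its own slice of training data.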
Tell the model to create storyboards etc. (use other GPTs to produce the synthetic data). Then the model can plan a story by first creating a storyboard and then using the board to write the story. So: correct planning.
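The plan-then-write idea can be sketched as a two-stage chain. This is only an illustration: `generate` stands for any text-generation call you plug in, and `stub_generate` is a made-up stand-in so the example runs without a model.

```python
def plan_storyboard(premise, generate, n_scenes=3):
    """Stage 1: ask for a storyboard and parse it into one scene per line."""
    raw = generate(
        f"Create a storyboard of {n_scenes} scenes, one per line, for: {premise}"
    )
    scenes = [line.strip() for line in raw.splitlines() if line.strip()]
    return scenes[:n_scenes]


def write_story(premise, generate, n_scenes=3):
    """Stage 2: write each scene in order, feeding back the story so far."""
    board = plan_storyboard(premise, generate, n_scenes)
    parts = []
    for i, scene in enumerate(board, start=1):
        so_far = " ".join(parts)
        prompt = (
            f"Premise: {premise}\n"
            f"Story so far: {so_far}\n"
            f"Write scene {i}: {scene}"
        )
        parts.append(generate(prompt))
    return board, "\n\n".join(parts)


def stub_generate(prompt):
    """Hypothetical stand-in for a model call, used only for this demo."""
    if prompt.startswith("Create a storyboard"):
        return "Arrival\nConflict\nResolution"
    # Echo the final instruction line so the flow of the chain is visible.
    return "[" + prompt.splitlines()[-1] + "]"


if __name__ == "__main__":
    board, story = write_story("a lighthouse keeper", stub_generate)
    print(board)
    print(story)
```

The storyboard and the finished story from each run could also be saved as a pair, which gives you planning examples to fine-tune on.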
These can also be found on W3Schools... so for such an answer it would need W3Schools access etc.? ... to learn how to write a novel, for instance, as a research task! So a highly connected model given a writing task can plan and research, create a model, then write!
So it's no easy task to train a creative writing model that needs no RAG (where everything is fine-tuned)!
The question is: what does a model do with a corpus of raw text? How should it handle it, and how do you want to retrieve it? Hence data coming in should probably have entities identified and topics identified (some grammatical task), to enable the model to identify the useful components. So later, when dropping data in, the model will use these tactics internally to craft the responses, segment the sequential data and associate it with some task!
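As a very rough illustration of tagging data on the way in, here is a sketch using only the Python standard library. The heuristics are deliberately naive assumptions of mine (capitalized words that do not start a sentence as entities, frequent longer words as topics), and a real pipeline would use a proper NLP tagger instead.

```python
import re
from collections import Counter

# Small assumed stopword list, just enough for the demo.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was",
             "it", "on", "for", "with", "that", "at", "by"}


def segment(text):
    """Split raw text into sentence-like segments on ., ! or ?"""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def entities(sentence):
    """Naive guess: capitalized words that are not the first word."""
    words = re.findall(r"[A-Za-z']+", sentence)
    return [w for w in words[1:] if w[0].isupper()]


def topics(sentence, k=3):
    """Naive guess: the most frequent non-stopwords longer than 3 letters."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", sentence)]
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 3)
    return [w for w, _ in counts.most_common(k)]


def annotate(text):
    """Attach entities and topics to every segment of the incoming text."""
    return [
        {"segment": s, "entities": entities(s), "topics": topics(s)}
        for s in segment(text)
    ]


def find(records, entity):
    """Retrieve the segments that mention a given entity."""
    return [r["segment"] for r in records if entity in r["entities"]]


if __name__ == "__main__":
    recs = annotate("Yesterday Alice sailed to Lisbon. The storm was fierce.")
    for r in recs:
        print(r)
    print(find(recs, "Lisbon"))
```

Annotated records like these can serve both as a retrieval index and as training examples for the extraction tasks themselves.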
Pretraining is very RAW (and has none of these nuances)! So when you fine-tune a model for this purpose, it is suggested that it be a specialized model!