Can I fine-tune it on another language? What should the dataset structure be?
#6 by smjain - opened
Can I fine-tune it on another language? What should the dataset structure be? Can you provide some pointers?
You can find a script for fine-tuning SantaCoder here. It lets you fine-tune on text datasets, such as the other programming languages in The Stack, but there's no guarantee the model can pick up a language it wasn't pre-trained on from fine-tuning alone. You could also try other tasks, such as Python-to-text translation with this dataset, or fine-tune on code classification tasks from CodeXGLUE, for example (some fine-tuning scripts are available here).
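On the dataset structure question, here is a rough sketch of one possible layout, not the script's required format: one record per source file, with the code stored in a single text column. The column name `content` mirrors the field The Stack uses for file text; the `path` field, the helper names, and the Lua samples are hypothetical and only there for illustration. Check the fine-tuning script's arguments for the column name it actually expects.

```python
import json

# Hypothetical sample files: {file path: source text}. Replace with your own corpus.
SAMPLES = {
    "hello.lua": 'print("hello")\n',
    "empty.lua": "   \n",
    "add.lua": "local function add(a, b)\n  return a + b\nend\n",
}


def build_records(source_files):
    """Build one record per file, skipping files that contain only whitespace.

    Assumption: the training script reads raw code from one text column,
    here named "content" as in The Stack.
    """
    return [
        {"content": text, "path": path}
        for path, text in source_files.items()
        if text.strip()
    ]


def write_jsonl(records, out_path):
    """Write records as JSON Lines: one JSON object per line."""
    with open(out_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    records = build_records(SAMPLES)
    write_jsonl(records, "train.jsonl")
    print(f"wrote {len(records)} records")
```

A file like this can be loaded with the `datasets` library via `load_dataset("json", data_files="train.jsonl")`, which gives you a dataset with a `content` column to point the script at.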