Steps in the creation of this model
Hi! Thank you for your excellent YouTube video on Whisper fine-tuning. In the video, after fine-tuning, you choose a checkpoint, merge, and upload. How did you gather all the files in this repo? I think tokenizer.json, for instance, is not created in the checkpoint. Did you copy it from the original model and put it here?
Thanks!
Nuno
Howdy! You can just save the tokenizer and then push it, and that should push the tokenizer files.
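Roughly something like this (the checkpoint name, local folder, and repo id below are just placeholders, not the exact ones I used):

```python
from transformers import WhisperTokenizer

# Load the tokenizer that matches the base model you fine-tuned
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v3")

# Save it next to the fine-tuned weights, then push it to the Hub repo
tokenizer.save_pretrained("./whisper-finetuned")           # placeholder local folder
tokenizer.push_to_hub("your-username/whisper-finetuned")   # placeholder repo id
```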
Hi! I ran exactly that experiment, to see what was actually being saved and when:
Saving the processor:
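A minimal sketch of the kind of thing I ran (the large-v3 checkpoint and the output folder are placeholders; I just list the directory afterwards to see which files were written):

```python
import os
from transformers import WhisperProcessor

out_dir = "./whisper-finetuned"  # placeholder output folder
processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3")
processor.save_pretrained(out_dir)
print(sorted(os.listdir(out_dir)))  # check which files the processor actually wrote
```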
Saving the tokenizer:
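Same idea, saving the (slow) WhisperTokenizer into the same placeholder folder and listing it again to compare:

```python
import os
from transformers import WhisperTokenizer

out_dir = "./whisper-finetuned"  # same placeholder folder as above
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v3")
tokenizer.save_pretrained(out_dir)
print(sorted(os.listdir(out_dir)))  # compare with the previous listing
```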
The tokenizer just overwrites the files the processor had already saved, except preprocessor_config.json.
I had to manually copy the tokenizer.json from the original model for the model to be usable (I'm converting it with CTranslate2 and using it with faster-whisper). I can see that you have tokenizer.json in your repo, so I was wondering what workflow you used; at least in my case the tokenizer.json file is not being saved. This is a large-v3 fine-tune, and it's a local copy only, so I'm not interested in pushing it to HF.
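For anyone else hitting this: tokenizer.json appears to be written by the fast tokenizer class rather than the slow one, so one way around the manual copy might be something like the following (paths are placeholders again):

```python
from transformers import WhisperTokenizerFast

# WhisperTokenizerFast is the class that serializes tokenizer.json;
# loading it from the base checkpoint and saving it into the fine-tuned
# folder should avoid copying the file by hand.
fast_tokenizer = WhisperTokenizerFast.from_pretrained("openai/whisper-large-v3")
fast_tokenizer.save_pretrained("./whisper-finetuned")  # placeholder fine-tune folder
```

If I remember correctly, ct2-transformers-converter also has a --copy_files option (e.g. --copy_files tokenizer.json preprocessor_config.json) that carries those files over into the converted faster-whisper folder.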
Thanks for the help,
Nuno
Interesting, thanks for sharing that. Yeah, it's possible I just copy-pasted the JSON as you did.