vocab.txt is missing
#18 · opened by trinisim
Without vocab.txt I can still use the fast tokenizer:
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
But if I try to use the non-fast tokenizer:
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2", use_fast=False)
I see the following error:
File "/sgreene/python/transformers/tokenization_utils_base.py", line 2048, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
File "/sgreene/python/transformers/transformers/tokenization_utils_base.py", line 2287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sgreene/python/transformers/models/bert/tokenization_bert.py", line 199, in __init__
    if not os.path.isfile(vocab_file):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen genericpath>", line 30, in isfile
I am unable to use the fast tokenizer in my project, so I need the non-fast tokenizer to work.
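Until a vocab.txt is added to the model repo, one possible workaround is to export the vocabulary from the fast tokenizer (which loads fine from tokenizer.json) and point the slow BertTokenizer at the resulting file. This is only a sketch: it assumes the vocabulary really is WordPiece-compatible with tokenization_bert.py, so tokenizations from the two tokenizers should be compared before relying on it.

```python
# Sketch of a workaround: write the fast tokenizer's token -> id mapping to a
# vocab.txt file in the format the slow BertTokenizer expects (one token per
# line, line number == token id).

def write_vocab_txt(vocab, path):
    """Write a token -> id mapping as one token per line, ordered by id."""
    with open(path, "w", encoding="utf-8") as f:
        for token, _ in sorted(vocab.items(), key=lambda kv: kv[1]):
            f.write(token + "\n")

# Usage (requires network access to the Hub; assumes the vocab is
# WordPiece-compatible):
#   from transformers import AutoTokenizer, BertTokenizer
#   fast = AutoTokenizer.from_pretrained(
#       "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
#   write_vocab_txt(fast.get_vocab(), "vocab.txt")  # get_vocab(): token -> id
#   slow = BertTokenizer(vocab_file="vocab.txt")
```

Verify that `fast.tokenize(text) == slow.tokenize(text)` on a sample of your data before using the exported file, since any mismatch would indicate the vocabulary is not plain WordPiece.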