How to get this running on Oobabooga with RTX 4080 16GB?
Thanks, I hope someone can give me guidance.
I want to run "alpaca-30b-4bit-128g.safetensors", which I think is the best version of Alpaca-30b-lora-int4, right?
I have an RTX 4080 and 64 GB of system RAM.
I want to split the model between GPU and CPU/system memory if Oobabooga supports that, or run it on CPU only if it can't be split.
I copied the repository into text-generation-webui.
I am using the most up-to-date version of Oobabooga, updated via git.
When I load "alpaca-30b-4bit-128g.safetensors" I can watch it start to fill VRAM or system RAM, but it always crashes with various errors near the end.
I don't know what to set Model Type to, or Wbits, or Groupsize (I assume groupsize 128). I have tried checking auto-devices, CPU, etc. Generally I can't get this to work at all; the errors are mostly tracebacks.
This is not the only kind of error I get:
Traceback (most recent call last):
File "F:\AI2\oobabooga-windowsBest\text-generation-webui\server.py", line 100, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name)
File "F:\AI2\oobabooga-windowsBest\text-generation-webui\modules\models.py", line 208, in load_model
tokenizer = LlamaTokenizer.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}/"), clean_up_tokenization_spaces=True)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1811, in from_pretrained
return cls._from_pretrained(
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1965, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\transformers\models\llama\tokenization_llama.py", line 96, in __init__
self.sp_model.Load(vocab_file)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\sentencepiece\__init__.py", line 905, in Load
return self.LoadFromFile(model_file)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\sentencepiece\__init__.py", line 310, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string
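That "TypeError: not a string" usually means the tokenizer files are missing from the model folder: LlamaTokenizer looks for tokenizer.model (the SentencePiece vocab) next to the weights, and if it isn't there it ends up passing None to sp_model.Load. A quick sanity check, with the folder name assumed from your setup:

from pathlib import Path
# tokenizer.model is the SentencePiece file LlamaTokenizer needs; this should print True
print(Path("models/alpaca-30b-lora-int4/tokenizer.model").exists())

If that prints False, grab tokenizer.model (plus the config/tokenizer JSON files) from the model's original repo and put them in that folder. Then try loading with: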
python server.py --model model-folder-name --wbits 4 --groupsize 128 --model_type llama
Remove the --groupsize parameter if you're using the non-grouped version.
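For reference, the 128g in alpaca-30b-4bit-128g.safetensors means the file was quantized with groupsize 128, so --groupsize 128 should be correct for your file. A non-grouped file (hypothetically, something named alpaca-30b-4bit.safetensors) would load with:

python server.py --model model-folder-name --wbits 4 --model_type llama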
Thank you, Elinas.
I just tried what you suggested.
In the start-webui edit, my command-line arguments are:
python server.py --model alpaca-30b-lora-int4 --wbits 4 --groupsize 128 --model_type llama
When I boot, I get this error in the CMD window:
Starting the web UI...
Gradio HTTP request redirected to localhost :)
Loading alpaca-30b-lora-int4...
Found the following quantized model: models\alpaca-30b-lora-int4\alpaca-30b-4bit-128g.safetensors
Loading model ...
Done.
Traceback (most recent call last):
File "F:\AI2\oobabooga-windowsBest\text-generation-webui\server.py", line 917, in
shared.model, shared.tokenizer = load_model(shared.model_name)
File "F:\AI2\oobabooga-windowsBest\text-generation-webui\modules\models.py", line 208, in load_model
tokenizer = LlamaTokenizer.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}/"), clean_up_tokenization_spaces=True)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1811, in from_pretrained
return cls.from_pretrained(
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\transformers\tokenization_utils_base.py", line 1965, in from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\transformers\models\llama\tokenization_llama.py", line 96, in init
self.sp_model.Load(vocab_file)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\sentencepiece_init.py", line 905, in Load
return self.LoadFromFile(model_file)
File "F:\AI2\oobabooga-windowsBest\installer_files\env\lib\site-packages\sentencepiece_init.py", line 310, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string
Press any key to continue . . .
And of course when I press a key to continue, it just closes.
Any ideas?
Here is a small update. I was advised to try adding layers to my command line.
My command line is:
python server.py --auto-devices --chat --pre_layer 31 --model alpaca-30b-lora-int4 --wbits 4 --groupsize 128 --model_type llama
I have tried with and without --auto-devices, and I have changed --pre_layer to values like 10, 20, 30, 40, 50, etc., but I still get the same error as above.
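For what it's worth, --pre_layer can't fix this: for GPTQ models it only sets how many layers are placed on the GPU (the rest run on the CPU), and your crash happens while loading the tokenizer, before any layers are allocated. The error still points at missing tokenizer files; a working model folder normally looks roughly like this:

models\alpaca-30b-lora-int4\
    alpaca-30b-4bit-128g.safetensors
    config.json
    tokenizer.model
    tokenizer_config.json
    special_tokens_map.json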
What transformers version are you running? These quantizations are only compatible with 4.27.0.dev0
I installed Oobabooga and then updated it, and I got several other models working. I don't know what transformers is, and I have no idea what "These quantizations are only compatible with 4.27.0.dev0" means.
If it's too difficult to help me, that's okay. If there is better software than Oobabooga, let me know what you recommend. Thanks, Elinas.
I prefer KoboldAI for story writing, connecting it with TavernAI/SillyTavern as a frontend for chatbots. Though 4-bit support is only in a fork right now, not in the official branch: https://github.com/0cc4m/KoboldAI
Ooba updated to transformers 4.28.0, I believe, which broke older models. You can check your version by running pip list and looking for the transformers package. Since it looks like you're using Windows, make sure to run that in the miniconda .bat file that launches the environment, or activate the environment if you installed it some other way.
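Concretely, something like this from a command prompt (paths taken from your traceback; the exact launcher .bat name depends on how you installed):

cd F:\AI2\oobabooga-windowsBest
rem launch the bundled environment via the installer's .bat first, then:
pip show transformers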
Thanks. I might give up on this. You are right, I am using Windows. I use git to update, so I assume I am running the most recent version.
I tried cmd and pip list:
Package Version
numpy 1.24.2
opencv-python 4.7.0.68
pip 23.0.1
setuptools 63.2.0
I can't figure out where else to run pip list. But regardless, I assume I have 4.28, so I am likely out of luck. They need to fix it so old models work, or I need to find a different solution like KoboldAI? I have that installed; I might try it later. Anyhow, thanks. You're welcome to close this if you want.
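One note on that output: it has no transformers (or torch) in it at all, which means you ran pip list in your system Python, not in the webui's bundled environment, so it doesn't actually tell us which transformers version the webui is using.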
You're better off using KAI if you want backwards compatibility for now. You'll need to create a new installation using the git repo I linked above.
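Roughly like this, assuming the fork keeps the usual KoboldAI Windows scripts (check its README for the exact steps):

git clone https://github.com/0cc4m/KoboldAI
cd KoboldAI
install_requirements.bat
play.bat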
Thanks I appreciate it.