Not working in Text Generation Web UI
I tried all model loaders with this model but it failed to load. Any ideas how to get it to load? Thanks.
i run it no problem using the blokes quantized version in gguf file format using llama.cpp loader. I use Q6_k quantized file. i get about 10 tokens/s
Try on this Colab: https://colab.research.google.com/drive/18XH8DTbgI4Zrsg-Xat-El3FvL8ZIDXMD
Change
llm_chain = LLMChain(prompt=prompt,
llm=HuggingFaceHub(repo_id="google/flan-t5-xl",
model_kwargs={"temperature":0,
"max_length":64}))
question = " what is capital of France?"
print(llm_chain.run(question))
to
llm_chain = LLMChain(prompt=prompt,
llm=HuggingFaceHub(repo_id="HuggingFaceH4/zephyr-7b-alpha",
model_kwargs={"temperature":0.7, # NOTE
"max_length":64}))
question = " what is capital of France?"
print(llm_chain.run(question))
Answers were OK not compared to this online chat https://huggingface.co/spaces/HuggingFaceH4/zephyr-chat