wtf response? :D

#1
by Ukro - opened

server.py --auto-devices --model_type LLaMa --chat --wbits 4 --groupsize 128
Input: hello
Output:

The response is an object with the following structure:

{
  "id": 1,
  "name": "Joe",
  "email": "joe@example.com"
}

and so on...

Logs:
Starting the web UI...
bin h:\0_oobabooga\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll
INFO:Loading TheBloke_guanaco-13B-GPTQ...
INFO:Found the following quantized model: models\TheBloke_guanaco-13B-GPTQ\Guanaco-13B-GPTQ-4bit-128g.no-act-order.safetensors
INFO:Loaded the model in 6.78 seconds.

INFO:Loading the extension "gallery"...
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Output generated in 27.31 seconds (2.05 tokens/s, 56 tokens, context 14, seed 1830835794)
Output generated in 6.82 seconds (1.76 tokens/s, 12 tokens, context 86, seed 1347457478)

You need to use a prompt template with these models.

In text-gen-ui, in the bottom left there's a "Prompt" dropdown box. Choose "Alpaca" and then enter your prompt in the template it provides, e.g.:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Hello, how are you?

### Response:

Working!
Thank you! <3 <3 <3

Ukro changed discussion status to closed
