Microsoft Phi3 series gguf
We use the same model as Microsoft's microsoft/Phi-3-mini-4k-instruct-gguf.
system_info: n_threads = 4 / 8 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
main: interactive mode on.
Reverse prompt: 'User:'
sampling:
repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 4096, n_batch = 2048, n_predict = 256, n_keep = 1
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to the AI.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision.
User: Hello, Bob.
Bob: Hello. How may I help you today?
User: Please tell me the largest city in Europe.
Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
User:What is the largest city in Australia?
Bob: The largest city in Australia is Sydney.
User:What is the largest city in US?
Bob: The largest city in the United States by population is New York City.
User:thanks
Bob: You're welcome! If you have any more questions, feel free to ask.
Here's a transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision. Additionally, Bob is proficient in providing detailed historical and cultural contexts for the information he provides.
User:
llama_print_timings: load time = 833.57 ms
llama_print_timings: sample time = 3.84 ms / 127 runs ( 0.03 ms per token, 33055.70 tokens per second)
llama_print_timings: prompt eval time = 14225.02 ms / 121 tokens ( 117.56 ms per token, 8.51 tokens per second)
llama_print_timings: eval time = 9098.34 ms / 124 runs ( 73.37 ms per token, 13.63 tokens per second)
llama_print_timings: total time = 70052.27 ms / 245 tokens
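
The per-token figures in the timing lines above can be re-derived from the elapsed time and token count that each line reports. The following is a minimal Python sketch, not part of llama.cpp: the regular expression and the helper names `parse_timings` and `tokens_per_second` are illustrative assumptions, and the pattern is written only against the line format shown in this log.

```python
import re

# Timing lines copied verbatim from the log above.
LOG = """\
llama_print_timings: sample time = 3.84 ms / 127 runs ( 0.03 ms per token, 33055.70 tokens per second)
llama_print_timings: prompt eval time = 14225.02 ms / 121 tokens ( 117.56 ms per token, 8.51 tokens per second)
llama_print_timings: eval time = 9098.34 ms / 124 runs ( 73.37 ms per token, 13.63 tokens per second)
"""

# Hypothetical pattern: "<stage> time = <ms> ms / <count> runs|tokens".
PATTERN = re.compile(
    r"llama_print_timings:\s+(?P<name>[a-z ]+?) time\s*=\s*"
    r"(?P<ms>[\d.]+) ms\s*/\s*(?P<count>\d+) (?:runs|tokens)"
)


def parse_timings(log: str) -> dict[str, tuple[float, int]]:
    """Return {stage: (elapsed_ms, count)} for every line that reports a count."""
    return {
        m.group("name"): (float(m.group("ms")), int(m.group("count")))
        for m in PATTERN.finditer(log)
    }


def tokens_per_second(elapsed_ms: float, count: int) -> float:
    """Throughput is the count divided by the elapsed time in seconds."""
    return count / (elapsed_ms / 1000.0)


timings = parse_timings(LOG)
for stage, (ms, count) in timings.items():
    rate = tokens_per_second(ms, count)
    print(f"{stage}: {ms / count:.2f} ms per token, {rate:.2f} tokens per second")
```

Recomputed this way, the prompt eval and eval stages come out at 8.51 and 13.63 tokens per second, matching the log. The sample-time rate differs slightly from the logged value because the log prints the elapsed milliseconds rounded to two decimals. The total time is much larger than the sum of the stages, which is likely because an interactive session also counts the time spent waiting for user input.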