vincentoh/llama3-70b-GGUF


# code 
https://huggingface.co/vincentoh/llama3_70b_no_robot_fsdp_qlora


# model
wget "https://huggingface.co/vincentoh/llama3-70b-GGUF/resolve/main/vincentoh/llama3-70b-GGUF"
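
On Hugging Face, a `/blob/` link is the HTML file viewer, while `/resolve/` serves the raw file, so a link copied from the browser usually needs that one path segment swapped before it is passed to `wget`. A minimal sketch of the conversion follows; the helper name `blob_to_resolve` is hypothetical, and the file path is taken from the command above as-is.

```python
def blob_to_resolve(url):
    """Swap the first '/blob/' path segment for '/resolve/' (raw download)."""
    return url.replace("/blob/", "/resolve/", 1)


# Viewer-style link, with the path exactly as written in this README.
page_url = (
    "https://huggingface.co/vincentoh/llama3-70b-GGUF"
    "/blob/main/vincentoh/llama3-70b-GGUF"
)

download_url = blob_to_resolve(page_url)
print(download_url)
```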

# memory usage
Thu May 16 15:53:07 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA H100 PCIe               On  | 00000000:08:00.0 Off |                    0 |
| N/A   37C    P0              76W / 350W |  40441MiB / 81559MiB |     24%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     17735      C   ./main                                    40428MiB |
+---------------------------------------------------------------------------------------+
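
The snapshot above can be reduced to a single utilisation figure. The sketch below pulls the `used / total` field out of the `nvidia-smi` row with a regular expression; the function name `parse_memory_usage` is an assumption for illustration, and the input line is copied from the table above.

```python
import re


def parse_memory_usage(smi_text):
    """Return (used_mib, total_mib) from the first 'NNNMiB / NNNMiB' field."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_text)
    if match is None:
        raise ValueError("no memory usage field found")
    return int(match.group(1)), int(match.group(2))


# Row for GPU 0, copied from the nvidia-smi output in this section.
line = "| N/A   37C    P0              76W / 350W |  40441MiB / 81559MiB |     24%      Default |"

used, total = parse_memory_usage(line)
print(f"{used} MiB of {total} MiB ({used / total:.1%})")
```

For this snapshot the model occupies roughly half of the H100's 80 GB (about 49.6%).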


# token speed
<|begin_of_text|>Why is the sky blue? The sky is blue due to a phenomenon called Rayleigh scattering. This scattering refers to the scattering of electromagnetic radiation (light) by particles much smaller than the wavelength of the light. The short-wavelength blue light is scattered more than the other colors of visible light, resulting in more blue light reaching the observer than the other colors of light.<|end_of_text|> [end of text]

llama_print_timings:        load time =    6244.37 ms
llama_print_timings:      sample time =       4.39 ms /    69 runs   (    0.06 ms per token, 15710.38 tokens per second)
llama_print_timings: prompt eval time =      90.86 ms /     7 tokens (   12.98 ms per token,    77.05 tokens per second)
llama_print_timings:        eval time =    2334.73 ms /    68 runs   (   34.33 ms per token,    29.13 tokens per second)
llama_print_timings:       total time =    2486.72 ms /    75 tokens
Log end
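
The throughput figures can be rechecked from the raw counts and times. The sketch below parses the `llama_print_timings` lines and recomputes tokens per second as `count / (ms / 1000)`; the regular expression and the name `parse_timings` are assumptions written against the log format shown above, not part of llama.cpp. Because the log prints milliseconds rounded to two decimals, recomputed rates can differ from the printed ones in the last digits.

```python
import re

# Matches e.g. "llama_print_timings: eval time = 2334.73 ms / 68 runs".
TIMING_RE = re.compile(
    r"llama_print_timings:\s*(?P<name>[a-z ]+?) time =\s*(?P<ms>[\d.]+) ms"
    r"(?: /\s*(?P<count>\d+) (?:runs|tokens))?"
)


def parse_timings(log_text):
    """Map each timing name to (milliseconds, count, tokens_per_second)."""
    timings = {}
    for m in TIMING_RE.finditer(log_text):
        ms = float(m.group("ms"))
        count = int(m.group("count")) if m.group("count") else None
        rate = count / (ms / 1000.0) if count else None
        timings[m.group("name")] = (ms, count, rate)
    return timings


# Timing lines copied from the log above.
LOG = """\
llama_print_timings:        load time =    6244.37 ms
llama_print_timings:      sample time =       4.39 ms /    69 runs   (    0.06 ms per token, 15710.38 tokens per second)
llama_print_timings: prompt eval time =      90.86 ms /     7 tokens (   12.98 ms per token,    77.05 tokens per second)
llama_print_timings:        eval time =    2334.73 ms /    68 runs   (   34.33 ms per token,    29.13 tokens per second)
llama_print_timings:       total time =    2486.72 ms /    75 tokens
"""

timings = parse_timings(LOG)
for name, (ms, count, rate) in timings.items():
    rate_text = f"{rate:.2f} tok/s" if rate is not None else "n/a"
    print(f"{name:>12}: {ms:9.2f} ms  {rate_text}")
```

The `eval` row is the generation speed: 68 tokens in 2334.73 ms, about 29.1 tokens per second.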