mohitsha/Llama-2-7b-chat-hf-AMMO-TRT

Llama 2 model checkpoint with an FP8 KV cache, for TensorRT-LLM (TRT-LLM).

Generated using the vLLM FP8 quantizer script: https://github.com/vllm-project/vllm/blob/main/examples/fp8/quantizer/quantize.py
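
For readers unfamiliar with the term, the snippet below is a minimal sketch of what an FP8 KV cache generally involves: a scaling factor is derived from calibration data so that cached key/value entries fit the FP8 E4M3 range (largest finite value 448), and values are rescaled on read. It is not the quantizer script's code. The function names (`kv_scale`, `round_to_e4m3`, `quantize`, `dequantize`) are hypothetical, and the use of a single per-tensor scale is an assumption; this README does not state how the checkpoint stores its scales.

```python
import math

# Largest finite value in the FP8 E4M3 (e4m3fn) format.
FP8_E4M3_MAX = 448.0


def kv_scale(calib_values):
    """Hypothetical helper: per-tensor scale mapping the largest
    calibration magnitude onto the FP8 E4M3 maximum."""
    amax = max(abs(v) for v in calib_values)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


def round_to_e4m3(x):
    """Round a float to the nearest representable E4M3 value, saturating at 448."""
    if x == 0.0:
        return 0.0
    mag = min(abs(x), FP8_E4M3_MAX)
    _, exp = math.frexp(mag)   # mag = m * 2**exp with 0.5 <= m < 1
    e = max(exp - 1, -6)       # -6 is the smallest normal exponent
    step = 2.0 ** (e - 3)      # 3 mantissa bits: 8 steps per power of two
    return math.copysign(round(mag / step) * step, x)


def quantize(values, scale):
    """Divide by the scale, then snap each value to the E4M3 grid."""
    return [round_to_e4m3(v / scale) for v in values]


def dequantize(q_values, scale):
    """Multiply by the scale to recover approximate original values."""
    return [q * scale for q in q_values]


if __name__ == "__main__":
    keys = [0.1, -3.2, 7.5]    # made-up sample values, not real activations
    s = kv_scale(keys)
    print(dequantize(quantize(keys, s), s))
```

With three mantissa bits, the round trip keeps each normal-range value within about 6.25% of the original, which is why a well-chosen scale matters for accuracy.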