output embeddings · #54 opened 4 months ago by pureve
output content · #53 opened 4 months ago by pureve
How to convert 4bit model back to fp16 data format? · 3 replies · #52 opened 8 months ago by tremblingbrain
add template · #51 opened 9 months ago by philschmid
torch.cuda.OutOfMemoryError: CUDA out of memory. · #50 opened 11 months ago by neo-benjamin
Can you please provide 'c4' version? · #49 opened 12 months ago by leeee1204
How much does it take to inference one sample? · #48 opened about 1 year ago by andreaKIM
Issues with CUDA and exllama_kernels · 9 replies · #47 opened about 1 year ago by ditchtech
Calling LlamaTokenizerFast.from_pretrained() with the path to a single file or url is not supported for this tokenizer. Use a model identifier or the path to a directory instead. · #46 opened about 1 year ago by kidrah-yxalag
Hallucination issue in Llama-2-13B-chat-GPTQ · 7 replies · #45 opened about 1 year ago by DivyanshTiwari7
Increasing the model's predefined max length · #44 opened about 1 year ago by MLconArtist
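
A note on #44 above: Llama-2's pretrained context window is 4096 tokens, and a common way to run beyond it is RoPE scaling, which interpolates position embeddings instead of extrapolating them. A minimal sketch, assuming transformers >= 4.31 (where the `rope_scaling` config field landed) plus a GPTQ-capable install (optimum + auto-gptq); the scaling factor of 2.0 is illustrative, and output quality typically degrades at the extended length without fine-tuning.

```python
# Hedged sketch: linear RoPE scaling to stretch the 4096-token context (~2x here).
# Assumes transformers >= 4.31 and a GPTQ-capable install (optimum + auto-gptq).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Llama-2-13B-chat-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    # Overrides the config's rope_scaling; factor 2.0 targets roughly 8192 tokens.
    rope_scaling={"type": "linear", "factor": 2.0},
)
```
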
[AUTOMATED] Model Memory Requirements · #43 opened about 1 year ago by model-sizer-bot
Deploying TheBloke/Llama-2-13B-chat-GPTQ as a batch end point in sagemaker · #41 opened about 1 year ago by vinaykakara
Deploying this on Text Generation Inference (TGI) server on AWS SageMaker · 1 reply · #38 opened about 1 year ago by ZaydJamadar
Understanding materials · 1 reply · #37 opened about 1 year ago by rishabh-gurbani
Temperature or top_p is not working · 2 replies · #35 opened about 1 year ago by chintan4560
Train model with webui · 1 reply · #34 opened about 1 year ago by Samitoo
HuggingFace's bitsandbytes vs AutoGPTQ? · 2 replies · #33 opened about 1 year ago by chongcy
What library was used to quantize this model ? · 1 reply · #32 opened about 1 year ago by ImWolf7
Dataset used for quantisation · 2 replies · #31 opened about 1 year ago by CarlosAndrea
How to make it (Llama-2-13B-chat-GPTQ) work with Fastchat · 4 replies · #30 opened over 1 year ago by Vishvendra
Error: Transformers import module musicgen · #29 opened over 1 year ago by galdezanni
Finetuning the model using custom dataset. · #28 opened over 1 year ago by Varanasi5213
Necessary material for llama2 · 7 replies · #27 opened over 1 year ago by Samitoo
Converting hf format model to 128g.safetensors · 7 replies · #26 opened over 1 year ago by goodromka
Llama-2-13B-chat-GPTQ problem · 2 replies · #23 opened over 1 year ago by nigsdf
Getting an error: AttributeError: module 'accelerate.utils' has no attribute 'modeling'. Please tell me what should i do? · #21 opened over 1 year ago by Dhairye
Getting error while loading model_basename = "gptq_model-8bit-128g" · 7 replies · #20 opened over 1 year ago by Pchaudhary
fine tune on custom chat dataset using QLORA & PEFT · 3 replies · #19 opened over 1 year ago by yashk92
General Update Question for LLMs · 2 replies · #17 opened over 1 year ago by Acrious
File not found error while loading model · 19 replies · #14 opened over 1 year ago by Osamarafique998
CPU Inference · 1 reply · #13 opened over 1 year ago by Ange09
Slow Inference Speed · #12 opened over 1 year ago by asifahmed
Error while loading model from path · 3 replies · #11 opened over 1 year ago by abhishekpandit
Censorship is hilarious · 6 replies · #10 opened over 1 year ago by tea-lover-418
why it says no quantize_config.json file but it has · 6 replies · #9 opened over 1 year ago by Mark000111888
Error loading model from a different branch with revision · 9 replies · #8 opened over 1 year ago by amitj
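
A note on #8 above: TheBloke's GPTQ repos publish alternative quantisations as separate git branches, and `from_pretrained` accepts a `revision` argument to select one. A minimal sketch; the branch name below follows this repo's naming scheme but is an assumption, so check the repo's branch list for the exact names.

```python
# Hedged sketch: loading a non-main quantisation branch via `revision`.
# The branch name is an assumption; verify it against the repo's actual branches.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Llama-2-13B-chat-GPTQ"
branch = "gptq-4bit-32g-actorder_True"  # assumed name of one quant branch

tokenizer = AutoTokenizer.from_pretrained(model_id, revision=branch)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    revision=branch,    # git branch, tag, or commit hash to download from
    device_map="auto",
)
```

The same `revision=` keyword appears in TheBloke's own README examples for selecting a branch, so errors in that thread are more likely a missing or misspelled branch name than an unsupported argument.
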
Llama v2 GPTQ context length · 6 replies · #7 opened over 1 year ago by andrewsameh
Is this model based on `chat` or `chat-hf` model of llama2? · 3 replies · #6 opened over 1 year ago by pootow
Prompt format · 8 replies · #5 opened over 1 year ago by mr96
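
A note on #5 above: the Llama-2-chat models expect the `[INST]`/`<<SYS>>` instruction wrapper, and TheBloke's model cards document the same template. A minimal single-turn sketch; the system prompt text is just an example, and the tokenizer normally prepends the BOS token itself.

```python
# Llama-2-chat single-turn prompt template ([INST] / <<SYS>> wrapper).
# The system prompt string is an arbitrary example, not part of the template.
def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    return (
        f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_prompt(
    "You are a helpful, respectful and honest assistant.",
    "Tell me about AI",
)
print(prompt)  # pass this string to the tokenizer or pipeline as-is
```
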
Bravo! That was fast : ) · 2 replies · #3 opened over 1 year ago by jacobgoldenart
Doesn't contain the files · 3 replies · #1 opened over 1 year ago by aminedjeghri