Requirements · #52 opened 3 months ago by sneakybeaky
Help Needed: Installing & Running Llama 3: 70B (140GB) on Dual RTX 4090 & 64GB RAM · #51 opened 5 months ago by kirushake
Llama Batch Inference of Llama-2-70B-Chat-GPTQ · #50 opened 8 months ago by Ivy111
Adding Evaluation Results · #49 opened 8 months ago by leaderboard-pr-bot
Adding Evaluation Results · #48 opened 12 months ago by leaderboard-pr-bot
Fine-tuning Llama 2 · #47 opened about 1 year ago by zuhashaik
Any example of batch inference? · #46 opened about 1 year ago by PrintScr
[AUTOMATED] Model Memory Requirements · #45 opened about 1 year ago by model-sizer-bot
Can we use an AWS EC2 free tier instance for testing the Llama 2 7B chat model? · #43 opened about 1 year ago by haroonHF
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM @ fschat 0.2.29, torch 2.0.1+cu118, transformers 4.33.3 · #42 opened about 1 year ago by Zeal666
Memory consumption much higher on multi-GPU setup (1 reply) · #41 opened about 1 year ago by simonesartoni1
GCP system to host the Llama 2 70B Chat model · #40 opened about 1 year ago by Hammad-Ahmad
How can I use the model to perform multi-GPU inference? · #39 opened about 1 year ago by weijie210
Inference takes more than 10 min · #38 opened about 1 year ago by shravanveldurthi
Out of memory error, but both system and GPU have plenty of memory (5 replies) · #37 opened about 1 year ago by mstachow
Is group size 128 or 1 for the main branch? (8 replies) · #36 opened about 1 year ago by brendanlui
Error when running pipe: temp_state buffer is too small (3 replies) · #35 opened about 1 year ago by StefanStroescu
Performance drop due to quantization? (4 replies) · #34 opened about 1 year ago by Teja-Gollapudi
What GPU and RAM are needed for Llama-2-70B-chat (int8 or fp16)? · #33 opened about 1 year ago by yanmengxiang666
In-context learning in Llama 2, thanks! · #32 opened about 1 year ago by yanmengxiang666
How to set max_split_size_mb? (1 reply) · #30 opened over 1 year ago by neo-benjamin
max_position_embeddings = 2048? (1 reply) · #29 opened over 1 year ago by zzzac
Load into 2 GPUs (3 replies) · #28 opened over 1 year ago by sauravm8
Load model into TGI · #27 opened over 1 year ago by schauppi
RuntimeError: shape '[4, 226, 24576]' is invalid for input of size 9256960 (4 replies) · #26 opened over 1 year ago by linkai-dl
Why is the input prompt part of the output? (3 replies) · #25 opened over 1 year ago by neo-benjamin
What does it mean that inject_fused_attention is disabled for the 70B model? (1 reply) · #24 opened over 1 year ago by neo-benjamin
Generates nonsense output and then breaks (1 reply) · #23 opened over 1 year ago by joycejiang
Perplexity · #22 opened over 1 year ago by gsaivinay
70B with multiple A5000s (6 replies) · #21 opened over 1 year ago by nashid
error: unexpected keyword argument 'inject_fused_attention' (3 replies) · #19 opened over 1 year ago by lasalH
Inference error: tensor shapes (8 replies) · #18 opened over 1 year ago by alejandrofdz
Updated to latest transformers and exllama, loading still fails (2 replies) · #17 opened over 1 year ago by yiouyou
llama.cpp just added GQA and full support for 70B LLaMA-2 (2 replies) · #16 opened over 1 year ago by igzbar
Inference time with TGI (1 reply) · #15 opened over 1 year ago by jacktenyx
Can't launch with TGI (6 replies) · #14 opened over 1 year ago by yekta
Output is merely a copy of the input for 70B @ webui (1 reply) · #13 opened over 1 year ago by wholehope
Error encountered: CUDA extension not installed while running (1 reply) · #12 opened over 1 year ago by wempoo
Can you show the settings for quantizing the model? (8 replies) · #11 opened over 1 year ago by hugginglaoda
ValueError: not enough values to unpack (expected 3, got 2) (1 reply) · #10 opened over 1 year ago by Esin
Further update with slight improvements to the prompt template; also removed the system message (1 reply) · #9 opened over 1 year ago by clayp
Bloke, please add a 70B GGML version (4 replies) · #8 opened over 1 year ago by mirek190
ExLlama is not working; received "shape '[1, 64, 64, 128]' is invalid for input of size 65536" error (2 replies) · #6 opened over 1 year ago by charleyzhuyi
text-generation-inference error (7 replies) · #5 opened over 1 year ago by msteele
Output always 0 tokens (11 replies) · #4 opened over 1 year ago by sterogn
What GPU is needed for this 70B one? (27 replies) · #2 opened over 1 year ago by RageshAntony
It doesn't work with ExLlama at the moment (2 replies) · #1 opened over 1 year ago by Shouyi987