Using accelerating libraries such as DeepSpeed
#8 opened by marccasals
I am trying to load the model using the DeepSpeed library. Is it possible to optimize this model with DeepSpeed? I have tried setting
replace_with_kernel_inject=True
but it doubled the amount of GPU RAM needed. Is there any solution?
When evaluating with lm-eval-harness, the model does seem to work with accelerate. After all, this model should be fully compatible with LLaMA, so any inference tricks that work for LLaMA should apply to OpenLLaMA.
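Since OpenLLaMA reuses the LLaMA architecture, a standard transformers load with accelerate's automatic device placement should work. A minimal sketch, assuming the `openlm-research/open_llama_7b` checkpoint; running it requires downloading the weights and enough GPU/CPU memory, and `accelerate` must be installed for `device_map="auto"`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name; substitute the OpenLLaMA variant you use.
model_name = "openlm-research/open_llama_7b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # fp16 roughly halves memory vs. fp32
    device_map="auto",          # accelerate shards layers across devices
)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading in fp16 with `device_map="auto"` avoids the duplicated-memory problem seen with kernel injection, since weights are placed directly on their target devices rather than materialized twice.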
young-geng changed discussion status to closed