Getting "Killed" (out of memory) while loading checkpoint shards

#119
by nitin1607 - opened

I am getting the error below:
Compiling model meta-llama/Meta-Llama-3.1-8B-Instruct int8 for the NPU
Loading checkpoint shards: 50%|████████████████████████████████████████████ | 2/4 [00:04<00:04, 2.15s/it]
Killed

I can see the following in the dmesg output:
[ 4570.797602] out_of_memory+0x103/0x340
[ 4570.798073] Out of memory: Killed process 8600 (pt_main_thread) total-vm:41954960kB, anon-rss:28691156kB, file-rss:104kB, shmem-rss:0kB, UID:0 pgtables:68224kB oom_score_adj:0

I am currently using a machine with 32 GB of RAM.
Any comments on this would be helpful, as I am new to this.
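
To put the dmesg numbers next to the 32 GB of RAM, here is a rough sketch that converts the kB fields of the OOM-killer line to GiB and compares them with the raw size of the weights at different precisions. The parameter count of 8e9 is an assumption read off the "8B" in the model name, and the helper names `parse_oom_fields` and `weights_gib` are made up for this illustration. The sketch only does the arithmetic; it does not show which dtype the checkpoint is actually loaded in before the int8 compilation step.

```python
import re

# The OOM-killer line copied from the dmesg output above.
DMESG_LINE = (
    "Out of memory: Killed process 8600 (pt_main_thread) "
    "total-vm:41954960kB, anon-rss:28691156kB, file-rss:104kB, "
    "shmem-rss:0kB, UID:0 pgtables:68224kB oom_score_adj:0"
)

KIB_PER_GIB = 1024 * 1024


def parse_oom_fields(line):
    """Return every 'name:<number>kB' field of the line as {name: GiB}."""
    pairs = re.findall(r"([\w-]+):(\d+)kB", line)
    return {name: int(kb) / KIB_PER_GIB for name, kb in pairs}


def weights_gib(n_params, bytes_per_param):
    """Raw size of the weights alone, in GiB (no activations, no overhead)."""
    return n_params * bytes_per_param / 1024**3


fields = parse_oom_fields(DMESG_LINE)

# Assumption: roughly 8e9 parameters, taken from "8B" in the model name.
N_PARAMS = 8e9
BYTES_PER_PARAM = {"float32": 4, "float16/bfloat16": 2, "int8": 1}

print(f"resident (anon-rss) when killed: {fields['anon-rss']:.2f} GiB")
print(f"virtual (total-vm) when killed:  {fields['total-vm']:.2f} GiB")
for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"weights in {dtype}: {weights_gib(N_PARAMS, nbytes):.2f} GiB")
```

The process was holding about 27.4 GiB of resident memory when it was killed, with only 2 of 4 shards loaded, so the load step alone is already close to the physical RAM of the machine.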

Thank you!

Are you running this on a CPU or a GPU? In other words, is the device set to 'cuda', 'cuda:0', 'auto', or 'cpu'?

@rkapuaala Thank you for your reply.
I am running this on an NPU on a Linux machine. I have installed https://github.com/intel/intel-npu-acceleration-library and am running Llama 3/3.1 on top of it, which is where I hit this issue.
Let me know if I have missed something.

Yeah, sorry, I missed that on the first line of the error message until you told me. I would recommend you provide more specs on your NPU, and possibly more specs on your Linux box and any devices running on it. What you're doing is way out of my league, and not at all like my error messages, which is why I asked about it. Good luck.

Thank you for your reply and comments!
