Installed ! pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary but modeling_flash_llama is still erroring out
I have installed all the required dependencies to run flash-attn:
! pip install flash-attn --no-build-isolation
! pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map='auto', trust_remote_code=True, torch_dtype=torch.bfloat16, revision="refs/pr/17")
This is not working. Error:
ImportError: Please install RoPE kernels: pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
I have already installed this dependency.
Output of:
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map='auto', trust_remote_code=True, torch_dtype=torch.bfloat16)
Downloading (…)lve/main/config.json: 100%
709/709 [00:00<00:00, 62.2kB/s]
Downloading (…)eling_flash_llama.py: 100%
45.3k/45.3k [00:00<00:00, 3.74MB/s]
A new version of the following files was downloaded from https://huggingface.co/togethercomputer/LLaMA-2-7B-32K:
- modeling_flash_llama.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
>>>> Flash Attention installed
ModuleNotFoundError Traceback (most recent call last)
~/.cache/huggingface/modules/transformers_modules/togethercomputer/LLaMA-2-7B-32K/aef6d8946ae1015bdb65c478a2dd73b58daaef47/modeling_flash_llama.py in
51 try:
---> 52 from flash_attn.layers.rotary import apply_rotary_emb_func
53 flash_rope_installed = True
12 frames
ModuleNotFoundError: No module named 'flash_attn.ops.triton'
During handling of the above exception, another exception occurred:
ImportError Traceback (most recent call last)
~/.cache/huggingface/modules/transformers_modules/togethercomputer/LLaMA-2-7B-32K/aef6d8946ae1015bdb65c478a2dd73b58daaef47/modeling_flash_llama.py in
55 except ImportError:
56 flash_rope_installed = False
---> 57 raise ImportError('Please install RoPE kernels: pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary')
58
59
ImportError: Please install RoPE kernels: pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary
This is currently a bug in flash-attn: the newer releases are missing flash_attn.ops.triton, which the rotary kernels import (that's the ModuleNotFoundError at the top of your traceback). Try pinning v2.1.1 for now:
pip install flash-attn==2.1.1 --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git@v2.1.1#subdirectory=csrc/rotary
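To confirm the pin actually took effect before re-running from_pretrained, a quick sketch (assumes flash_attn exposes __version__, which recent releases do; parse_version and is_pinned are hypothetical helper names, not part of flash-attn):

```python
# Sketch: verify the pinned flash-attn install before retrying the model load.
# parse_version / is_pinned are illustrative helpers, not flash-attn APIs.

def parse_version(v: str) -> tuple:
    """Turn a version string like '2.1.1' into (2, 1, 1) for comparison."""
    return tuple(int(p) for p in v.split(".")[:3])

def is_pinned(installed: str, wanted: str = "2.1.1") -> bool:
    """True if the installed version matches the pinned one."""
    return parse_version(installed) == parse_version(wanted)

try:
    import flash_attn
    ok = is_pinned(flash_attn.__version__)
    print("flash-attn", flash_attn.__version__,
          "(pinned correctly)" if ok else "(NOT 2.1.1 -- reinstall with the pin)")
except ImportError:
    # pip install never took effect in this environment (e.g. wrong kernel/venv)
    print("flash-attn is not importable at all; check which environment pip used")
```

In a notebook, remember to restart the runtime after reinstalling, or the old version stays loaded.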
that worked... thanks
how does one figure this out by themselves :)
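The clue is in the traceback chain: the final "Please install RoPE kernels" ImportError is raised inside an except block, and the exception above it ("No module named 'flash_attn.ops.triton'") names the real problem, i.e. a broken flash-attn package rather than a missing rotary install. A sketch of how to surface that directly, by running the exact import that modeling_flash_llama.py wraps in try/except:

```python
# Run the failing import yourself so the root cause prints instead of the
# generic "install RoPE kernels" message from the model code.
err = None
try:
    from flash_attn.layers.rotary import apply_rotary_emb_func  # noqa: F401
    print("rotary kernels import fine")
except ImportError as e:
    err = e
    # With the buggy flash-attn wheel this names the actual missing module,
    # e.g. ModuleNotFoundError: No module named 'flash_attn.ops.triton'
    print(type(e).__name__, "->", e)
```

From there, searching the flash-attn issue tracker for the missing module name usually turns up the bug and the last known-good version to pin.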