flash_attn fails to build in my local environment for model inference.

#71 opened by siddharth-magesh

When I use this model on Colab, inference works fine, but once I load it locally it raises an error telling me to "pip install flash_attn".
When I run that install from my CLI, the wheel fails to build.
I tried multiple versions of flash_attn and none of them built.
I also tried a few workarounds, such as "pip install flash-attn==2.5.5 --no-build-isolation", but none of them worked either.
I am using Python 3.10 and torch==2.3.1 with CUDA 11.8.
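For reference, this is roughly how I am loading the model locally (a minimal sketch; the model id, dtype, and device_map here are placeholders, not my exact values):

```python
# Sketch of the local load call that triggers the "pip install flash_attn" error.
# The model id is a placeholder; torch.float16 and device_map="auto" are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/model-name"  # placeholder for the actual model id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,  # the model's remote code is what asks for flash_attn
    device_map="auto",
)
```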
What could be the issue here?