Patching HF bug that creates the wrong cache length when only inputs_embeds is passed to the model

#19
NVIDIA org
No description provided.
itlevy changed pull request status to merged
