Can't find zephyr-7b-beta cache using optimum cli list command.

#21

by Anurag2132 - opened Mar 14

Mar 14

I am a beginner and am having trouble finding and loading the cache files that I need for zephyr-7b-beta. I am using the commands given in the guides, but I get errors such as "repo not found". Can someone please help by giving the exact commands to find and load the cache for the model I mentioned?

dacorvo

AWS Inferentia and Trainium org Mar 14

Hi, please make sure you have the latest version of optimum-neuron installed:

$ pip install -U optimum-neuron

Then type:

$ optimum-cli neuron cache lookup HuggingFaceH4/zephyr-7b-beta

*** 0 entrie(s) found in cache for HuggingFaceH4/zephyr-7b-beta for training.*** 

*** 12 entrie(s) found in cache for HuggingFaceH4/zephyr-7b-beta for inference.*** 
...
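If you want to check the result from a script rather than by eye, the summary lines above can be parsed. This is only a minimal sketch: it assumes the output keeps the exact wording shown above (which may change between optimum-neuron versions), and `count_cache_entries` is a hypothetical helper name, not part of optimum-neuron.

```python
import re

# Matches summary lines such as:
# *** 12 entrie(s) found in cache for HuggingFaceH4/zephyr-7b-beta for inference.***
# The wording is copied from the output above and is an assumption about the CLI.
ENTRY_LINE = re.compile(
    r"\*\*\* (\d+) entrie\(s\) found in cache for (\S+) for (training|inference)\.\*\*\*"
)


def count_cache_entries(lookup_output: str) -> dict:
    """Return the number of cached entries per mode found in the lookup output."""
    return {m.group(3): int(m.group(1)) for m in ENTRY_LINE.finditer(lookup_output)}
```

The text passed to the helper would be the captured standard output of the `optimum-cli neuron cache lookup` command.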

Anurag2132

Mar 14

Hey, thank you for the response. I get this when I try that:

$ optimum-cli neuron cache lookup HuggingFaceH4/zephyr-7b-beta
usage: optimum-cli neuron cache [-h] {create,set,add,list,synchronize} ...
optimum-cli neuron cache: error: argument {create,set,add,list,synchronize}: invalid choice: 'lookup' (choose from 'create', 'set', 'add', 'list', 'synchronize')

Does lookup not work on Inf2?

dacorvo

AWS Inferentia and Trainium org Mar 14

You don't seem to have the latest version of optimum-neuron, which is 0.0.20. This is what that version shows:

$ pip show optimum-neuron
Name: optimum-neuron
Version: 0.0.20
...
$ optimum-cli neuron cache -h
usage: optimum-cli neuron cache [-h] {create,set,add,synchronize,lookup} ...

positional arguments:
  {create,set,add,synchronize,lookup}
    create              Create a model repo on the Hugging Face Hub to store Neuron X compilation files.
    set                 Set the name of the Neuron cache repo to use locally (trainium only).
    add                 Add a model to the cache of your choice (trainium only).
    synchronize         Synchronize the neuronx compiler cache with a hub cache repo.
    lookup              Lookup the neuronx compiler hub cache for the specified model id.

options:
  -h, --help            show this help message and exit
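The same version check can be done from Python. This is a rough sketch: it treats 0.0.20 as the threshold only because that is the version shown above with the `lookup` subcommand (an earlier release may already include it), and `installed_version` and `has_lookup` are hypothetical helper names.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Assumption: 0.0.20 is the version shown in this thread to provide `lookup`.
KNOWN_GOOD = (0, 0, 20)


def installed_version(package="optimum-neuron"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def has_lookup(version_string):
    """Return True if the version is at least the one known to provide `lookup`."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        return False
    return tuple(int(part) for part in match.groups()) >= KNOWN_GOOD
```

If `installed_version()` returns `None` or `has_lookup` returns `False`, running `pip install -U optimum-neuron` as suggested earlier should fix it.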

Anurag2132

Mar 14

Thank you, updating optimum worked.
Is there also a way to download or load the NEFF files into my local environment so that I don't have to export the model? Sorry if this is a stupid question; this is not really my domain.

dacorvo

AWS Inferentia and Trainium org Mar 14

If you export the model for one of the cached configurations (batch_size, sequence_length, auto_cast_type, num_cores), then the cached NEFFs will be fetched automatically (you'll see messages on the console).
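As an illustration, an export that reuses a cached configuration might look like the sketch below. The values in `EXPORT_CONFIG` are placeholders, not a known cached entry: replace them with one of the configurations reported by `optimum-cli neuron cache lookup HuggingFaceH4/zephyr-7b-beta`. `load_from_cache` is a hypothetical wrapper around `NeuronModelForCausalLM.from_pretrained`.

```python
# Placeholder values (assumption): they must match one configuration listed
# by `optimum-cli neuron cache lookup` for the cached NEFFs to be reused.
EXPORT_CONFIG = {
    "batch_size": 1,
    "sequence_length": 4096,
    "num_cores": 2,
    "auto_cast_type": "fp16",
}


def load_from_cache(model_id, config):
    """Export the model for the given configuration on a Neuron instance."""
    # Imported lazily because optimum-neuron is only usable where the
    # Neuron SDK is installed (for example an Inf2 instance).
    from optimum.neuron import NeuronModelForCausalLM

    return NeuronModelForCausalLM.from_pretrained(model_id, export=True, **config)
```

On an Inf2 instance, calling `load_from_cache("HuggingFaceH4/zephyr-7b-beta", EXPORT_CONFIG)` should fetch the cached files instead of recompiling when the configuration matches a cached entry; otherwise a full compilation is to be expected.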

dacorvo changed discussion status to closed Apr 17
