The signal array doesn't match the model?

by koukou123 - opened

I just copied the code to Colab, ran it, and got an error:
RuntimeError: Given groups=1, weight of size [512, 1, 10], expected input[1, 16000, 1] to have 1 channels, but got 16000 channels instead

What is the problem?

audEERING GmbH org

I ran the code and it works for me. From the error message, it looks like the input to the model has the wrong shape, namely [16000, 1] instead of [1, 16000].

The shape of np.zeros((1, sampling_rate), dtype=np.float32) is [1, 16000]. Why does your code work when mine doesn't?

I changed signal = np.zeros((1, sampling_rate), dtype=np.float32) to signal = np.zeros((sampling_rate, 1), dtype=np.float32)
and now I get a different error: "Calculated padded input size per channel: (1). Kernel size: (10). Kernel size can't be greater than actual input size".
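
Both error messages can be worked out from shapes alone. The following is a minimal sketch in plain NumPy, not the model's actual code. It assumes that the model inserts a channel axis at position 1 before its first 1-D convolution, and that a 2-D result is then treated as an unbatched (channels, samples) input. The helper names `add_channel_axis` and `check_first_conv` are hypothetical; the kernel size and channel count are read off the weight size [512, 1, 10] in the first error.

```python
import numpy as np

SAMPLING_RATE = 16000
IN_CHANNELS = 1    # middle entry of the weight size [512, 1, 10]
KERNEL_SIZE = 10   # last entry of the weight size [512, 1, 10]


def add_channel_axis(x):
    """Mimic the assumed preprocessing: insert a channel axis at position 1."""
    y = x[:, None]
    if y.ndim == 2:
        # Assumption: a 2-D input is read as (channels, samples) without a batch,
        # so a batch axis of size 1 is added in front.
        y = y[None]
    return y


def check_first_conv(x):
    """Describe whether x fits a conv layer expecting (batch, 1, samples)."""
    batch, channels, samples = add_channel_axis(x).shape
    if channels != IN_CHANNELS:
        return f"expected {IN_CHANNELS} channels, got {channels}"
    if samples < KERNEL_SIZE:
        return f"kernel size {KERNEL_SIZE} > input size {samples}"
    return "ok"


flat = np.zeros(SAMPLING_RATE, dtype=np.float32)           # shape (16000,)
column = np.zeros((SAMPLING_RATE, 1), dtype=np.float32)    # shape (16000, 1)
row = np.zeros((1, SAMPLING_RATE), dtype=np.float32)       # shape (1, 16000)

for name, x in [("flat", flat), ("column", column), ("row", row)]:
    print(name, x.shape, "->", add_channel_axis(x).shape, ":", check_first_conv(x))
```

Under these assumptions, a 1-D array of 16000 samples ends up as [1, 16000, 1] and matches the first error, a [16000, 1] array ends up with a single sample per channel and matches the second error, and only a [1, 16000] array passes.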

audEERING GmbH org

Mhh strange. We have a more in-depth tutorial on the model at Maybe have a look there and see if that one works for you (it's using an ONNX export instead of the original Torch model).

@koukou123 The input dimensions in the following line are wrong:

outputs = self.wav2vec2(input_values)

Try changing the dimensions as follows. It worked for me:

outputs = self.wav2vec2(input_values.reshape(1, 16000))
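
The reshape above hard-codes one second of audio at 16 kHz. A possible variant, sketched here with NumPy arrays and a hypothetical helper name `to_batch_of_one`, lets the length be inferred with `-1` so that signals of other durations also end up as [1, num_samples]. It assumes a single signal is passed in; it is not meant for inputs that already contain several batch items, since it would flatten them into one row.

```python
import numpy as np


def to_batch_of_one(input_values):
    """Reshape a single signal to (1, num_samples), inferring num_samples."""
    # Hypothetical helper: -1 lets reshape compute the length from the data.
    return input_values.reshape(1, -1)


one_second = np.zeros(16000, dtype=np.float32)        # shape (16000,)
column = np.zeros((16000, 1), dtype=np.float32)       # shape (16000, 1)
two_seconds = np.zeros(32000, dtype=np.float32)       # shape (32000,)

print(to_batch_of_one(one_second).shape)
print(to_batch_of_one(column).shape)
print(to_batch_of_one(two_seconds).shape)
```

With a fixed `reshape(1, 16000)`, the two-second signal would raise an error because 32000 values cannot fill a [1, 16000] array.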

it works!!!!!!! thank you!!!!

audEERING GmbH org

I have updated the model card accordingly. Thanks for reporting; I will close the issue now.

frankenjoe changed discussion status to closed
