The signal array doesn't match the model?
I just copied the code to Colab, ran it, and got this error:
RuntimeError: Given groups=1, weight of size [512, 1, 10], expected input[1, 16000, 1] to have 1 channels, but got 16000 channels instead
What is the problem?
I ran the code and it works for me. From the error message it looks like the input to the model has the wrong shape, namely [16000, 1] instead of [1, 16000].
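For illustration, the channel error can be reproduced in isolation with a bare Conv1d layer. This is a minimal sketch, not code from the model card; the layer parameters are read off the weight shape [512, 1, 10] in the error message (1 input channel, 512 output channels, kernel size 10):

import torch

conv = torch.nn.Conv1d(in_channels=1, out_channels=512, kernel_size=10)

# Expected layout for the convolution: [batch, channels, time] with 1 channel.
out = conv(torch.zeros(1, 1, 16000))  # works; out.shape == [1, 512, 15991]

# A [1, 16000, 1] tensor is read as 16000 channels of length 1,
# which raises exactly the RuntimeError above.
conv(torch.zeros(1, 16000, 1))

wav2vec2 takes raw audio as [batch_size, num_samples] and adds the channel axis internally, so the signal array itself should stay [1, 16000].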
The shape of np.zeros((1, sampling_rate), dtype=np.float32) is [1, 16000]. Why does your code work and mine doesn't?
I changed signal = np.zeros((1, sampling_rate), dtype=np.float32) to signal = np.zeros((sampling_rate, 1), dtype=np.float32)
and got a different error: "Calculated padded input size per channel: (1). Kernel size: (10). Kernel size can't be greater than actual input size".
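(Continuing the bare-layer sketch from above: a (sampling_rate, 1) array puts the 16000 samples on the batch axis and leaves a time axis of length 1, which is shorter than the kernel size of 10, hence this second error:)

conv(torch.zeros(16000, 1, 1))  # RuntimeError: Kernel size can't be greater than actual input size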
Hmm, strange. We have a more in-depth tutorial on the model at https://github.com/audeering/w2v2-how-to. Maybe have a look there and see if that one works for you (it uses an ONNX export instead of the original Torch model).
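In case it helps, here is a minimal sketch of running an ONNX export with onnxruntime. The file name model.onnx and the plain zero signal are assumptions; see the repository above for the actual loading code:

import numpy as np
import onnxruntime

session = onnxruntime.InferenceSession('model.onnx')  # hypothetical path
input_name = session.get_inputs()[0].name             # query the input name from the graph
signal = np.zeros((1, 16000), dtype=np.float32)       # [batch_size, num_samples]
outputs = session.run(None, {input_name: signal})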
@koukou123 The input dimensions in the following line are wrong:
outputs = self.wav2vec2(input_values)
Try changing the dimensions as follows. It worked for me:
outputs = self.wav2vec2(input_values.reshape(1, 16000))
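One robustness note: hard-coding 16000 assumes a one-second signal at 16 kHz. Using reshape(1, -1) instead keeps a batch dimension of one for a signal of any length:

outputs = self.wav2vec2(input_values.reshape(1, -1))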
It works! Thank you!
I have updated the model card accordingly. Thanks for reporting; I will close the issue now.