Why Not Utilize a Sigmoid Function in the Regression Layer?

#8
by xwz-xmu

The regression layer typically performs a linear transformation, which does not inherently constrain the range of the predicted rewards. Consequently, the model may output extremely large positive values or even negative ones, whereas the ground-truth rewards in the dataset are normalized to the range [0, 1].

I am curious whether omitting a sigmoid function in the regression layer could negatively impact the performance of multi-objective reward modeling.
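For concreteness, here is a minimal sketch of the two head variants in question, written in PyTorch. The names (`RewardHead`, `hidden_size`, `num_objectives`, `use_sigmoid`) are illustrative assumptions for this example, not the repo's actual code:

```python
import torch
import torch.nn as nn

class RewardHead(nn.Module):
    """Illustrative multi-objective regression head (not the repo's actual code)."""

    def __init__(self, hidden_size: int, num_objectives: int, use_sigmoid: bool = False):
        super().__init__()
        self.linear = nn.Linear(hidden_size, num_objectives)
        self.use_sigmoid = use_sigmoid

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        # Linear transformation: outputs are unbounded, so predicted rewards
        # can be arbitrarily large or negative.
        rewards = self.linear(last_hidden_state)
        if self.use_sigmoid:
            # Optional squashing into (0, 1) to match rewards normalized to [0, 1].
            rewards = torch.sigmoid(rewards)
        return rewards
```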

RLHFlow org

You can try a sigmoid. I previously tried a logistic-regression loss (with a sigmoid) and did not find that it outperformed plain regression. Regression on the Llama backbone is quite stable.
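As a rough sketch of the comparison described in the reply (the exact training code is not shown in this thread, and the shapes and names below are assumed for the example): plain regression applies MSE directly to the unbounded linear outputs, while the logistic-regression variant applies a sigmoid via a binary cross-entropy loss.

```python
import torch
import torch.nn.functional as F

# Illustrative tensors: raw outputs of the linear head and ground-truth
# rewards normalized to [0, 1], shaped (batch, num_objectives).
logits = torch.randn(4, 5)
targets = torch.rand(4, 5)

# Plain regression: MSE directly on the unbounded outputs.
mse_loss = F.mse_loss(logits, targets)

# Logistic-regression alternative: the sigmoid is applied implicitly inside
# the numerically stable BCE-with-logits loss, so the head itself stays linear.
bce_loss = F.binary_cross_entropy_with_logits(logits, targets)
```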

Haoxiang-Wang changed discussion status to closed
