Why Not Utilize a Sigmoid Function in the Regression Layer?
#8 by xwz-xmu · opened
The regression layer typically performs a linear transformation, which does not inherently constrain the range of the predicted rewards. Consequently, the model may output extremely large positive values or even negative ones, whereas the ground-truth rewards in the dataset are normalized to the range [0, 1].
I am curious whether omitting a sigmoid function in the regression layer could negatively impact the performance of multi-objective reward modeling.
xwz-xmu changed discussion title from "why not use a sigmoid function on regression layer?" to "Why Not Utilize a Sigmoid Function in the Regression Layer?"
You can try a sigmoid. I tried a logistic-regression loss (with a sigmoid) before and did not find it to outperform plain regression. Regression on the Llama backbone is quite stable.
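The two head variants discussed above can be sketched in plain Python. This is a minimal illustration, not the model's actual implementation: the feature vector, weights, and function names (`linear_head`, `sigmoid_head`) are hypothetical, and real reward models would do this with tensor operations. It shows why the linear head is unbounded while the sigmoid variant always lands in (0, 1), matching the [0, 1] range of the normalized ground-truth rewards.

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def linear_head(features, weights, bias):
    """Plain regression head: an unbounded real-valued reward."""
    return sum(f * w for f, w in zip(features, weights)) + bias

def sigmoid_head(features, weights, bias):
    """Same head with a sigmoid squashing the output into (0, 1)."""
    return sigmoid(linear_head(features, weights, bias))

# Hypothetical pooled features and head parameters, for illustration only.
feats = [0.5, -1.2, 2.0]
w = [1.0, 0.3, 0.8]
b = 0.1

raw = linear_head(feats, w, b)       # can fall outside [0, 1]
bounded = sigmoid_head(feats, w, b)  # always strictly inside (0, 1)
```

Note that training the sigmoid variant against [0, 1] targets amounts to the logistic-regression-style loss mentioned in the reply, whereas the unbounded head is trained with an ordinary regression loss; the reply reports no performance gain from the former.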
Haoxiang-Wang changed discussion status to closed