arxiv:2412.14922

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Published on Dec 19 · Submitted by luojunyu on Dec 24
#1 Paper of the day

Abstract

Supervised fine-tuning (SFT) plays a crucial role in adapting large language models (LLMs) to specific domains or tasks. However, as demonstrated by empirical experiments, the collected data inevitably contains noise in practical applications, which poses significant challenges to model performance on downstream tasks. Therefore, there is an urgent need for a noise-robust SFT framework to enhance model capabilities in downstream tasks. To address this challenge, we introduce a robust SFT framework (RobustFT) that performs noise detection and relabeling on downstream task data. For noise identification, our approach employs a multi-expert collaborative system with inference-enhanced models to achieve superior noise detection. In the denoising phase, we utilize a context-enhanced strategy, which incorporates the most relevant and confident knowledge followed by careful assessment to generate reliable annotations. Additionally, we introduce an effective data selection mechanism based on response entropy, ensuring only high-quality samples are retained for fine-tuning. Extensive experiments conducted on multiple LLMs across five datasets demonstrate RobustFT's exceptional performance in noisy scenarios.
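As a rough illustration of the entropy-based selection step mentioned in the abstract, the sketch below scores each response by its mean per-token Shannon entropy and keeps the lowest-entropy fraction. It is not the authors' implementation: the function names, the `token_distributions` field, and the `keep_fraction` cutoff are hypothetical, and the paper may define response entropy and its threshold differently.

```python
import math
from typing import Dict, List, Sequence


def mean_token_entropy(token_distributions: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) over the per-token probability
    distributions of a single response."""
    if not token_distributions:
        raise ValueError("response has no tokens")
    total = 0.0
    for dist in token_distributions:
        # Zero-probability entries contribute nothing to the entropy.
        total += -sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(token_distributions)


def select_low_entropy(samples: List[Dict], keep_fraction: float = 0.5) -> List[Dict]:
    """Keep the fraction of samples whose responses have the lowest entropy.

    Each sample is assumed (hypothetically) to carry a "token_distributions"
    entry holding one probability distribution per generated token.
    """
    scored = sorted(samples, key=lambda s: mean_token_entropy(s["token_distributions"]))
    keep = max(1, int(len(scored) * keep_fraction))
    return scored[:keep]
```

For example, given one sample generated with near-certain token probabilities and another with uniform ones, `select_low_entropy(samples, keep_fraction=0.5)` would retain only the former.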

Community

Paper submitter

Hi there, today we introduce RobustFT, a noise-robust supervised fine-tuning framework designed to enhance the performance of LLMs in the presence of noisy training data. Supervised fine-tuning (SFT) is essential for adapting LLMs to specific domains, but noisy training data can significantly impact model performance. RobustFT addresses this challenge through:

  • Multi-expert collaborative noise detection
  • Context-enhanced relabeling strategy
  • Response entropy-based data selection

Our code is available at https://github.com/luo-junyu/RobustFT
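To make the first bullet more concrete, here is a minimal sketch of agreement-based noise detection. It is not taken from the linked repository: it assumes each "expert" is simply a callable that maps a question to an answer string, and that a sample counts as clean only when enough experts reproduce its given response after light normalization. The names `normalize`, `is_noisy`, and `min_agreement` are hypothetical, and exact-match comparison is a simplifying assumption.

```python
from typing import Callable, Sequence


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(answer.lower().split())


def is_noisy(
    question: str,
    given_response: str,
    experts: Sequence[Callable[[str], str]],
    min_agreement: float = 1.0,
) -> bool:
    """Flag a sample as potentially noisy unless at least `min_agreement`
    of the experts produce the same (normalized) answer as the given response."""
    if not experts:
        raise ValueError("at least one expert is required")
    target = normalize(given_response)
    agreeing = sum(1 for expert in experts if normalize(expert(question)) == target)
    return agreeing / len(experts) < min_agreement
```

With `min_agreement=1.0`, a single dissenting expert is enough to send the sample on to relabeling; lowering the value makes the detector more permissive.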

[Figure: the RobustFT framework]
