Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Abstract
Large language models (LLMs) have achieved remarkable performance in recent years but are fundamentally limited by the underlying training data. To improve models beyond the training data, recent works have explored how LLMs can be used to generate synthetic data for autonomous self-improvement. However, successive steps of self-improvement can reach a point of diminishing returns. In this work, we propose a complementary approach towards self-improvement where finetuning is applied to a multiagent society of language models. A group of language models, all starting from the same base model, are independently specialized by updating each one using data generated through multiagent interactions among the models. By training each model on independent sets of data, we illustrate how this approach enables specialization across models and diversification over the set of models. As a result, our overall system is able to preserve diverse reasoning chains and autonomously improve over many more rounds of finetuning than single-agent self-improvement methods. We quantitatively illustrate the efficacy of the approach across a wide suite of reasoning tasks.
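To make the procedure concrete, below is a minimal Python sketch of the training loop the abstract describes. It is an illustration under assumptions, not the authors' implementation: the `generate` and `finetune` methods and the majority-vote selection rule are hypothetical stand-ins for the paper's multiagent-interaction, data-selection, and finetuning steps.

```python
import copy
from collections import Counter

def majority_answer(answers):
    """Most common answer among the agents' drafts (assumed selection rule)."""
    return Counter(answers).most_common(1)[0][0]

def multiagent_finetune(base_model, tasks, n_agents=3, n_rounds=5):
    # Every agent starts as a copy of the same base model.
    agents = [copy.deepcopy(base_model) for _ in range(n_agents)]

    for _ in range(n_rounds):
        # One *independent* finetuning set per agent: this is what lets
        # the models specialize and the population stay diverse.
        datasets = [[] for _ in range(n_agents)]

        for task in tasks:
            # Multiagent interaction: independent drafts first...
            drafts = [agent.generate(task) for agent in agents]
            consensus = majority_answer(drafts)
            for i, agent in enumerate(agents):
                # ...then each agent revises after seeing the others' drafts.
                others = drafts[:i] + drafts[i + 1:]
                revised = agent.generate(task, context=others)
                # Keep only examples that agree with the consensus
                # (a stand-in for the paper's data-selection criterion).
                if revised == consensus:
                    datasets[i].append((task, revised))

        # Each agent finetunes only on the data it generated itself.
        agents = [agent.finetune(data)
                  for agent, data in zip(agents, datasets)]

    return agents  # an ensemble of specialized models
```

Because each model trains on a different dataset, the agents drift apart rather than collapsing onto a single reasoning style, which is what allows improvement to continue over more rounds than single-agent self-improvement.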
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning (2024)
- Language Models as Continuous Self-Evolving Data Engineers (2024)
- Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models (2024)
- MALT: Improving Reasoning with Multi-Agent LLM Training (2024)
- Efficient Multi-Agent Collaboration with Tool Use for Online Planning in Complex Table Question Answering (2024)
- MATATA: A weakly-supervised MAthematical Tool-Assisted reasoning for Tabular Applications (2024)
- Enhancing Reasoning through Process Supervision with Monte Carlo Tree Search (2025)