Seeker: Enhancing Exception Handling in Code with LLM-based Multi-Agent Approach
Abstract
In real-world software development, improper or missing exception handling can severely impact the robustness and reliability of code. Exception handling mechanisms require developers to detect, capture, and manage exceptions according to high standards, but many developers struggle with these tasks, leading to fragile code. This problem is particularly evident in open-source projects and degrades the overall quality of the software ecosystem. To address this challenge, we explore the use of large language models (LLMs) to improve exception handling in code. Through extensive analysis, we identify three key issues: Insensitive Detection of Fragile Code, Inaccurate Capture of Exception Types, and Distorted Handling Solutions. These problems are widespread across real-world repositories, suggesting that robust exception handling practices are often overlooked or mishandled. In response, we propose Seeker, a multi-agent framework inspired by the strategies expert developers use for exception handling. Seeker employs five agents (Scanner, Detector, Predator, Ranker, and Handler) to help LLMs detect, capture, and resolve exceptions more effectively. Our work is the first systematic study of leveraging LLMs to enhance exception handling practices, providing valuable insights for future improvements in code reliability.
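To make the three issues concrete, the following minimal Python sketch (illustrative only, not code from the paper; the function and file names are ours) contrasts a fragile function, where a risky I/O call goes undetected and unguarded, with a robust variant that captures the specific exception types involved and handles each one deliberately.

```python
# Illustrative sketch of the three issues; not code from the paper.
import json
import logging

logger = logging.getLogger(__name__)

def load_config_fragile(path):
    # Insensitive detection: the risky file/parse calls are not recognized as fragile,
    # so no exception handling is attempted at all. A typical quick "fix" such as
    # `except Exception: pass` would instead show inaccurate capture (too broad a type)
    # and a distorted handling solution (the error is silently swallowed).
    with open(path) as f:
        return json.load(f)

def load_config_robust(path, default=None):
    # Capture the precise exception types the fragile statements can raise,
    # and pair each with an informative, recoverable handling strategy.
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        logger.warning("Config file %s not found; falling back to defaults", path)
        return default
    except json.JSONDecodeError as exc:
        logger.error("Config file %s is malformed: %s", path, exc)
        raise
```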
Community
As the functional correctness of large language models (LLMs) in code generation continues to gain attention and improve, generating code that passes more test cases and contains fewer functional vulnerabilities has become a decisive criterion for evaluating LLMs' coding performance. However, few works examine how well LLMs handle code robustness, as represented by exception handling mechanisms. Especially in real development scenarios, the exception mechanism imposes strict standards on developers for detecting, capturing, and handling exceptions. Due to the lack of interpretable experience and generalizable strategies, highly robust code is relatively scarce even in mainstream open-source projects, which in turn degrades the quality of code training data and, consequently, the generation quality of LLMs. This prompted us to raise a research question that few have explored: "Do we need to enhance the standardization, interpretability, and generalizability of exception handling in real code development scenarios?"

To confirm this need, we first uncovered three pillar phenomena of incorrect exception handling through extensive LLM-based and human code review: Insensitive-Detection of Fragile Code, Inaccurate-Capture of Exception Type, and Distorted-Solution of Handling Block. These phenomena occur frequently in both real repositories and generated code, indicating that neither human developers nor LLMs spontaneously grasp the skills exception handling requires. Surprisingly, this poor LLM performance is substantially mitigated when prompts are enhanced with precise exception types, scenario logic, and handling strategies. To exploit this effect, we propose a new method called Seeker: a chain of agents modeled on how the most experienced human developers reason about exception handling tasks, separated into Scanner, Detector, Predator, Ranker, and Handler agents. To the best of our knowledge, our work is the first systematic study of the robustness of LLM-generated code in real development scenarios, providing valuable insights for future research on reliable code generation.
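As a rough illustration of how such an agent chain could be wired, the sketch below strings the five agents together around a generic `llm` callable; the interfaces, prompts, data class, and naive chunking step are our assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of a five-agent chain; interfaces and prompts are assumptions.
from dataclasses import dataclass, field

@dataclass
class ExceptionFinding:
    snippet: str                                               # fragile code unit flagged by the Detector
    candidate_types: list[str] = field(default_factory=list)   # supplied by the Predator
    ranked_types: list[str] = field(default_factory=list)      # ordered by the Ranker
    handled_code: str = ""                                      # rewritten block from the Handler

def run_seeker_like_pipeline(source: str, llm) -> list[ExceptionFinding]:
    """Chain the agents in the order the paper names them; `llm` is any text-in/text-out callable."""
    findings = []
    # Scanner: split the source into reviewable units (here, naive blank-line chunks).
    units = [u for u in source.split("\n\n") if u.strip()]
    for unit in units:
        # Detector: ask whether the unit contains fragile, unguarded code.
        if "yes" not in llm(f"Does this code contain unhandled, fragile operations? {unit}").lower():
            continue
        finding = ExceptionFinding(snippet=unit)
        # Predator: retrieve the exception types the fragile code could raise.
        finding.candidate_types = llm(f"List exception types this code may raise: {unit}").split(",")
        # Ranker: order candidate types by how well they fit the scenario logic.
        finding.ranked_types = llm(
            f"Rank these exception types by relevance to the code: {finding.candidate_types}"
        ).split(",")
        # Handler: generate a handling block for the top-ranked exception types.
        finding.handled_code = llm(
            f"Rewrite the code with try/except for {finding.ranked_types[:2]}: {unit}"
        )
        findings.append(finding)
    return findings
```

Read this way, Scanner narrows the search space, Detector flags fragile units, Predator and Ranker supply and prioritize precise exception types (the "enhanced prompt" information noted above), and Handler produces the final handling block.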
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? (2024)
- Vulnerability Handling of AI-Generated Code - Existing Solutions and Open Challenges (2024)
- HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale (2024)
- APILOT: Navigating Large Language Models to Generate Secure Code by Sidestepping Outdated API Pitfalls (2024)
- VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching (2024)