Kavli Affiliate: Gang Su
| First 5 Authors: Huifeng Lin
| Summary:
Retrieval-Augmented Generation (RAG) based on Large Language Models (LLMs) is
a powerful solution for understanding and querying an industry's closed-source
documents. However, basic RAG often struggles with complex QA tasks in legal
and regulatory domains, particularly when dealing with large collections of
government documents. The top-$k$ retrieval strategy frequently misses golden
chunks, leading to incomplete or inaccurate answers. To address these retrieval
bottlenecks, we
explore two strategies to improve evidence coverage and answer quality. The
first is a One-SHOT retrieval method that adaptively selects chunks based on a
token budget, allowing as much relevant content as possible to be included
within the model’s context window. Additionally, we design modules to further
filter and refine the chunks. The second is an iterative retrieval strategy
built on a Reasoning Agentic RAG framework, where a reasoning LLM dynamically
issues search queries, evaluates retrieved results, and progressively refines
the context over multiple turns. We identify two issues, query drift and
retrieval laziness, and design two further modules to tackle them. Through extensive
experiments on a dataset of government documents, we aim to offer practical
insights and guidance for real-world applications in legal and regulatory
domains.
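
The One-SHOT strategy described above could be sketched as a greedy packing step: rank retrieved chunks by relevance and admit them until the token budget is exhausted. This is a minimal illustration only; the names (`Chunk`, `one_shot_select`) and the greedy policy are assumptions, and the paper's additional filtering and refinement modules are not modeled here.

```python
# Hypothetical sketch of One-SHOT token-budget chunk selection.
# The abstract only states that chunks are "adaptively selected based on a
# token budget"; a greedy fill by relevance score is one plausible reading.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    score: float    # retriever relevance score (assumed)
    n_tokens: int   # pre-computed token count of the chunk

def one_shot_select(chunks, token_budget):
    """Greedily pack the highest-scoring chunks that fit the context window."""
    selected, used = [], 0
    for c in sorted(chunks, key=lambda c: c.score, reverse=True):
        if used + c.n_tokens <= token_budget:
            selected.append(c)
            used += c.n_tokens
    return selected

# Illustrative data, not from the paper's government-document dataset.
chunks = [
    Chunk("regulation A", 0.9, 300),
    Chunk("regulation B", 0.7, 500),
    Chunk("unrelated note", 0.2, 200),
]
picked = one_shot_select(chunks, token_budget=700)
print([c.text for c in picked])
```

Unlike a fixed top-$k$ cutoff, the budget-based stop lets the number of selected chunks vary with chunk length, which is the property the abstract credits with improving evidence coverage.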
| Search Query: ArXiv Query: search_query=au:"Gang Su"&id_list=&start=0&max_results=3