Kavli Affiliate: Feng Wang
| First 5 Authors: Jiaxing Li, Chi Xu, Lianchen Jia, Feng Wang, Cong Zhang
| Summary:
Large Language Models are revolutionizing Web, mobile, and Web of Things
systems, driving intelligent and scalable solutions. However, as
Retrieval-Augmented Generation (RAG) systems expand, they encounter significant
challenges related to scalability, including increased delay and communication
overhead. To address these issues, we propose EACO-RAG, an edge-assisted
distributed RAG system that leverages adaptive knowledge updates and inter-node
collaboration. By distributing vector datasets across edge nodes and optimizing
retrieval processes, EACO-RAG significantly reduces delay and resource
consumption while enhancing response accuracy. The system employs a multi-armed
bandit framework with safe online Bayesian methods to balance performance and
cost. Extensive experimental evaluation demonstrates that EACO-RAG outperforms
traditional centralized RAG systems in both response time and resource
efficiency. EACO-RAG effectively reduces delay and resource expenditure to
levels comparable to, or even lower than, those of local RAG systems, while
significantly improving accuracy. This study presents the first systematic
exploration of edge-assisted distributed RAG architectures, providing a
scalable and cost-effective solution for large-scale distributed environments.
| Search Query: ArXiv Query: search_query=au:"Feng Wang"&id_list=&start=0&max_results=3
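
The summary says the system uses a multi-armed bandit framework to balance performance and cost, but gives no algorithmic detail. The sketch below is not EACO-RAG's method. It shows plain Beta-Bernoulli Thompson sampling with a linear cost penalty, only to illustrate how a bandit can trade accuracy against cost when choosing where to serve a query. The tier names (`local`, `edge`, `cloud`), the cost and accuracy numbers, the `cost_weight` parameter, and the `ThompsonRouter` class are all hypothetical, and the "safe" constraint handling mentioned in the summary is not modelled.

```python
import random


class ThompsonRouter:
    """Beta-Bernoulli Thompson sampling over hypothetical retrieval tiers."""

    def __init__(self, costs, cost_weight=0.5, seed=0):
        self.costs = dict(costs)          # assumed normalised cost per query
        self.cost_weight = cost_weight    # assumed accuracy/cost trade-off
        self.rng = random.Random(seed)
        # Beta(1, 1) prior on each arm's probability of a correct answer.
        self.alpha = {arm: 1.0 for arm in self.costs}
        self.beta = {arm: 1.0 for arm in self.costs}

    def choose(self):
        """Sample an accuracy per arm and pick the best cost-penalised one."""
        def score(arm):
            sampled_acc = self.rng.betavariate(self.alpha[arm], self.beta[arm])
            return sampled_acc - self.cost_weight * self.costs[arm]
        return max(self.costs, key=score)

    def update(self, arm, correct):
        """Update the chosen arm's posterior with a correct/incorrect outcome."""
        if correct:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0

    def posterior_mean(self, arm):
        return self.alpha[arm] / (self.alpha[arm] + self.beta[arm])


def simulate(rounds=2000, seed=0):
    # Made-up numbers for illustration only; not taken from the paper.
    costs = {"local": 0.1, "edge": 0.3, "cloud": 1.0}
    true_acc = {"local": 0.55, "edge": 0.80, "cloud": 0.90}
    router = ThompsonRouter(costs, cost_weight=0.5, seed=seed)
    env = random.Random(seed + 1)
    picks = {arm: 0 for arm in costs}
    for _ in range(rounds):
        arm = router.choose()
        router.update(arm, env.random() < true_acc[arm])
        picks[arm] += 1
    return router, picks


if __name__ == "__main__":
    router, picks = simulate()
    print(picks)
```

With these assumed numbers the cost-penalised utilities are 0.50 for `local`, 0.65 for `edge`, and 0.40 for `cloud`, so the router should concentrate its choices on the `edge` tier as the posteriors sharpen.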