Kavli Affiliate: Li Xin Li
| First 5 Authors: Bingchen Li, Xin Li
| Summary:
Existing image restoration (IR) studies typically address either task-specific
or universal modes in isolation, relying on users to select the mode and
lacking cooperation among multiple task-specific and universal restoration
modes. This makes interaction difficult for non-expert users and limits
restoration capability in complicated real-world applications. In this work,
we present HybridAgent, which aims to incorporate multiple restoration modes
into a unified image restoration model and to achieve intelligent and
efficient user interaction through our proposed hybrid agents.
Concretely, we propose a hybrid rule of fast, slow, and feedback restoration
agents. The slow restoration agent optimizes a powerful multimodal large
language model (MLLM) with our proposed instruction-tuning dataset so that it
can identify degradations in images accompanied by ambiguous user prompts and
invoke the proper restoration tools accordingly. The fast restoration agent is
built on a lightweight large language model (LLM) and uses in-context learning
to handle user prompts with simple and clear requirements, which avoids the
unnecessary time and resource costs of the MLLM. Moreover, we introduce a
mixed distortion removal mode for HybridAgent, which is crucial but has not
been addressed in previous agent-based work. It effectively prevents the error
propagation of step-by-step image restoration and substantially improves the
efficiency of the agent system. We validate the effectiveness of HybridAgent
on both synthetic and real-world IR tasks.
| Search Query: ArXiv Query: search_query=au:"Li Xin Li"&id_list=&start=0&max_results=3
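
The fast/slow split described in the summary can be pictured with a toy router. This is only an illustrative sketch, not the authors' implementation: the degradation vocabulary, the `Plan` structure, and the `route` function are all hypothetical, and simple keyword matching stands in for the LLM and MLLM calls. It assumes that a prompt naming one degradation goes to the fast agent, a prompt naming several goes to a single mixed distortion pass, and anything else is deferred to the slow agent.

```python
from dataclasses import dataclass

# Hypothetical vocabulary of degradations; the summary does not list the tools.
KNOWN_DEGRADATIONS = ("noise", "blur", "rain", "haze", "jpeg")


@dataclass
class Plan:
    agent: str           # "fast" or "slow"
    mode: str            # "task_specific", "mixed", or "to_be_identified"
    degradations: tuple  # degradations named in the prompt, if any


def route(prompt: str) -> Plan:
    """Toy router: keyword matching stands in for the LLM/MLLM calls."""
    text = prompt.lower()
    found = tuple(d for d in KNOWN_DEGRADATIONS if d in text)
    if not found:
        # Ambiguous prompt: defer to the slow agent, which would inspect the image.
        return Plan("slow", "to_be_identified", ())
    if len(found) > 1:
        # Several degradations named: one mixed-distortion pass instead of chaining tools.
        return Plan("fast", "mixed", found)
    # A single, clearly named degradation: the fast agent picks one tool.
    return Plan("fast", "task_specific", found)


if __name__ == "__main__":
    for p in ("Please remove the rain",
              "Remove noise and blur",
              "Make this photo look nicer"):
        print(p, "->", route(p))
```

In a real system the keyword check would be replaced by the lightweight LLM's in-context prediction, and the slow path would pass both the image and the prompt to the MLLM; the feedback agent is omitted here because the summary does not describe its behavior.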