Kavli Affiliate: Ke Wang
| First 5 Authors: Hanshi Wang, , ,
| Summary:
The established redundancy in visual tokens within large vision-language
models allows pruning to effectively reduce their substantial computational
demands. Previous methods typically employ heuristic, layer-specific pruning
strategies: although the number of tokens removed may differ across decoder
layers, the overall schedule is fixed and applied uniformly to all input
samples and tasks, so token elimination is never aligned with the model’s
holistic reasoning trajectory. Cognitive science indicates that human
visual processing often begins with broad exploration to accumulate evidence
before narrowing focus as the target becomes distinct. Our experiments reveal
an analogous pattern in these models. This observation suggests that neither a
fixed pruning schedule nor a heuristic layer-wise strategy can optimally
accommodate the diverse complexities inherent in different inputs. To overcome
this limitation, we introduce Complexity-Adaptive Pruning (AutoPrune), a
training-free, plug-and-play framework that tailors pruning policies to varying
sample and task complexities. Specifically, AutoPrune quantifies the mutual
information between visual and textual tokens and projects this signal onto a
budget-constrained logistic retention curve. Each curve, defined by its shape
parameters, reflects the complexity of a given sample or task while
guaranteeing adherence to a predefined computational budget.
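To make the retention-curve idea concrete, the following minimal NumPy sketch
shows one way such a schedule could be built. It is not the paper's
implementation: the entropy-based proxy for mutual information, the
complexity-to-midpoint mapping, and the constants in retention_curve are all
illustrative assumptions.

import numpy as np

def complexity_score(attn_text_to_visual: np.ndarray) -> float:
    # Normalized entropy of the text-to-visual attention mass, used here
    # as a crude, hypothetical stand-in for a mutual-information estimate.
    p = attn_text_to_visual / attn_text_to_visual.sum()
    entropy = -(p * np.log(p + 1e-9)).sum()
    return float(entropy / np.log(p.size))  # scaled to [0, 1]

def retention_curve(complexity: float, num_layers: int, budget: float) -> np.ndarray:
    # Per-layer fraction of visual tokens to keep: a logistic decay whose
    # midpoint shifts later for more complex inputs (explore longer before
    # pruning), rescaled so the mean retention matches the global budget.
    layers = np.arange(num_layers)
    midpoint = complexity * (num_layers - 1)  # illustrative linear mapping
    steepness = 0.5                           # illustrative constant
    curve = 1.0 / (1.0 + np.exp(steepness * (layers - midpoint)))
    curve *= budget / curve.mean()            # enforce the average budget
    return np.clip(curve, 0.0, 1.0)           # clipping can slightly undershoot it

# Example: a 32-layer decoder with an 11% average retention budget,
# i.e. roughly the 89% visual-token pruning reported below.
schedule = retention_curve(complexity=0.7, num_layers=32, budget=0.11)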
We evaluate AutoPrune on standard vision-language tasks and on
Vision-Language-Action models for autonomous driving. Notably, when applied to
LLaVA-1.5-7B, our method prunes 89% of visual tokens and reduces inference
FLOPs by 76.8% while retaining 96.7% of the original accuracy averaged over all
tasks. This corresponds to a 9.1% improvement over the recent method PDrop,
demonstrating its effectiveness. Code is available at
https://github.com/AutoLab-SAI-SJTU/AutoPrune.
| Search Query: ArXiv Query: search_query=au:"Ke Wang"&id_list=&start=0&max_results=3