Kavli Affiliate: Jing Wang | First 5 Authors: Hui Wang, Zhengpeng Zhao, Jing Wang, Yushu Du, Yuan Cheng | Summary: Deep neural networks increasingly rely on sparsity to offset the growth in model parameter size. However, turning sparsity and pruning into lower wall-clock time remains challenging because of irregular memory access patterns, leading to frequent […]
NVR: Vector Runahead on NPUs for Sparse Memory Access