Kavli Affiliate: Ran Wang | First 5 Authors: Bin Chen, Ran Wang, Di Ming, Xin Feng, | Summary: Recent advances in Transformers have brought new trust to computer vision tasks. However, on small datasets, Transformers are hard to train and perform worse than convolutional neural networks. We make vision transformers as data-efficient as convolutional […]
ViT-P: Rethinking Data-efficient Vision Transformers from Locality