Kavli Affiliate: Feng Wang
| First 5 Authors: Feng Wang, Huiyu Wang, Chen Wei, Alan Yuille, Wei Shen
| Summary:
Recent advances in self-supervised contrastive learning yield good
image-level representations, which favor classification tasks but usually
neglect pixel-level detail, leading to unsatisfactory transfer
performance on dense prediction tasks such as semantic segmentation. In this
work, we propose a pixel-wise contrastive learning method called CP2
(Copy-Paste Contrastive Pretraining), which facilitates both image- and
pixel-level representation learning and therefore is more suitable for
downstream dense prediction tasks. In detail, we copy-paste a random crop from
an image (the foreground) onto different background images and pretrain a
semantic segmentation model with the objective of 1) distinguishing the
foreground pixels from the background pixels, and 2) identifying the composed
images that share the same foreground. Experiments show the strong performance
of CP2 in downstream semantic segmentation: By finetuning CP2 pretrained models
on PASCAL VOC 2012, we obtain 78.6% mIoU with a ResNet-50 and 79.5% with a
ViT-S.
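The copy-paste composition described above can be sketched as follows. This is an illustrative example, not the authors' implementation: a foreground crop is pasted onto two different backgrounds, producing two views that share the same foreground, together with binary masks separating foreground from background pixels (the targets of the two pretraining objectives). The function name and paste coordinates are hypothetical.

```python
# Illustrative sketch of CP2-style view composition (not the official code).
import numpy as np

def copy_paste(foreground, background, top, left):
    """Paste `foreground` onto `background` at (top, left).

    Returns the composed image and a binary mask that is 1 on
    foreground pixels and 0 on background pixels.
    """
    h, w = foreground.shape[:2]
    composed = background.copy()
    composed[top:top + h, left:left + w] = foreground
    mask = np.zeros(background.shape[:2], dtype=np.uint8)
    mask[top:top + h, left:left + w] = 1
    return composed, mask

# Two views share the same foreground crop but use different backgrounds:
# the masks supervise the pixel-level objective (foreground vs. background),
# and the shared foreground supervises the image-level objective.
rng = np.random.default_rng(0)
fg = rng.random((64, 64, 3))
bg1 = rng.random((224, 224, 3))
bg2 = rng.random((224, 224, 3))
view1, mask1 = copy_paste(fg, bg1, top=10, left=20)
view2, mask2 = copy_paste(fg, bg2, top=100, left=50)
```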
| Search Query: ArXiv Query: search_query=au:"Feng Wang"&id_list=&start=0&max_results=10