Crafting Better Contrastive Views for Siamese Representation Learning

Kavli Affiliate: Zheng Zhu

| First 5 Authors: Xiangyu Peng, Kai Wang, Zheng Zhu, Mang Wang, Yang You

| Summary:

Recent self-supervised contrastive learning methods benefit greatly from the
Siamese structure, which aims at minimizing distances between positive pairs.
For high-performance Siamese representation learning, one of the keys is to
design good contrastive pairs. Most previous works simply apply random
sampling to make different crops of the same image, which overlooks the
semantic information of the image and may degrade the quality of the views. In
this work, we propose ContrastiveCrop, which effectively generates better
crops for Siamese representation learning. First, a semantic-aware object
localization strategy is proposed that operates within the training process in
a fully unsupervised manner. This guides the generation of contrastive views
that avoid most false positives (i.e., object vs. background). Moreover, we
empirically find that views with similar appearances are trivial for Siamese
model training. Thus, a center-suppressed sampling scheme is further designed
to enlarge the variance of crops. Remarkably, our method carefully constructs
positive pairs for contrastive learning with negligible extra training
overhead. As a plug-and-play and framework-agnostic module, ContrastiveCrop
consistently improves SimCLR, MoCo, BYOL, and SimSiam by 0.4%–2.0%
classification accuracy on CIFAR-10, CIFAR-100, Tiny ImageNet, and STL-10.
Superior results are also achieved on downstream detection and segmentation
tasks when pre-trained on ImageNet-1K.
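
The two components described in the summary lend themselves to a compact
illustration. Below is a minimal PyTorch-style sketch of how semantic-aware
localization and center-suppressed sampling might look; the heatmap threshold,
the Beta(0.6, 0.6) parameters, and the function names are illustrative
assumptions, not the authors' exact implementation.

```python
# Sketch of the two ideas in the abstract, under assumed hyperparameters.
import torch

def localize_bbox(feature_map: torch.Tensor, thresh: float = 0.1):
    """Derive a rough object bounding box from an unsupervised activation map.

    feature_map: (C, H, W) activations from the encoder's last conv layer.
    Returns (x0, y0, x1, y1) in normalized [0, 1] image coordinates.
    """
    heat = feature_map.relu().sum(dim=0)                 # (H, W) aggregate
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
    mask = heat > thresh                                 # keep salient region
    ys, xs = torch.nonzero(mask, as_tuple=True)
    if ys.numel() == 0:                                  # fall back: full image
        return 0.0, 0.0, 1.0, 1.0
    H, W = heat.shape
    return (xs.min().item() / W, ys.min().item() / H,
            (xs.max().item() + 1) / W, (ys.max().item() + 1) / H)

def center_suppressed_crop(bbox, crop_w: float, crop_h: float,
                           alpha: float = 0.6):
    """Sample a crop center inside bbox from a U-shaped Beta(alpha, alpha).

    With alpha < 1 the density is lowest at the box center, so paired crops
    overlap less (larger appearance variance) while staying on the object.
    Returns (left, top, width, height) in normalized coordinates.
    """
    beta = torch.distributions.Beta(alpha, alpha)
    x0, y0, x1, y1 = bbox
    cx = x0 + beta.sample().item() * (x1 - x0)
    cy = y0 + beta.sample().item() * (y1 - y0)
    # Clamp the crop window so it stays inside the unit image.
    left = min(max(cx - crop_w / 2, 0.0), 1.0 - crop_w)
    top = min(max(cy - crop_h / 2, 0.0), 1.0 - crop_h)
    return left, top, crop_w, crop_h
```

In a training loop, one would periodically refresh the bounding boxes from the
encoder's own feature maps and feed the sampled windows to the usual crop and
augmentation pipeline, which is consistent with the negligible-overhead,
plug-and-play claim.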

| Search Query: ArXiv Query: search_query=au:"Zheng Zhu"&id_list=&start=0&max_results=10
