Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning

Kavli Affiliate: Xiang Zhang

| First 5 Authors: Weijia Li, Jinhua Yu, Dairong Chen, Yi Lin, Runmin Dong

| Summary:

In this work, we propose a geometry-aware semi-supervised method for
fine-grained building function recognition. This method leverages the geometric
relationships between multi-source data to improve the accuracy of pseudo
labels in semi-supervised learning, extending the task’s scope and making it
applicable to cross-categorization systems of building function recognition.
In the first stage, we design an online semi-supervised pre-training scheme
that helps localize building facades precisely in
street-view images. In the second stage, we propose a geometry-aware coarse
annotation generation module. This module effectively combines GIS data and
street-view data based on the geometric relationships, improving the accuracy
of pseudo annotations. In the third stage, we combine the newly generated
coarse annotations with the existing labeled dataset to achieve fine-grained
functional recognition of buildings across multiple cities at a large scale.
Extensive experiments demonstrate that our proposed framework exhibits superior
performance in fine-grained functional recognition of buildings. Within the
same categorization system, it achieves improvements of 7.6% and 4.8% compared
to fully-supervised methods and state-of-the-art semi-supervised methods,
respectively. Our method also performs well in cross-city tasks, in which
the model trained on OmniCity (New York) is extended to new areas
(Los Angeles and Boston). This study provides a novel solution for
large-scale, fine-grained building function recognition across multiple
cities, offering essential data for understanding urban infrastructure
planning, human activity patterns, and the interactions between humans and
buildings.
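
The summary does not spell out how GIS data and street-view data are combined in the second stage. As a rough illustration only, the Python sketch below shows one way such a geometric link could work: the compass bearing of a detected facade box in a panorama is compared with the bearing from the camera to each GIS building, and the closest building's function label is used as a coarse annotation. This is not the authors' implementation. The bearing-matching rule, the equirectangular panorama assumption, the `max_diff_deg` tolerance, and all function and field names are hypothetical.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def box_bearing_deg(x_center, image_width, heading_deg):
    """Bearing of a facade box, assuming an equirectangular panorama
    whose centre column faces heading_deg (an assumption of this sketch)."""
    offset = (x_center / image_width - 0.5) * 360.0
    return (heading_deg + offset) % 360.0


def angular_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def coarse_label(camera, box_x_center, image_width, buildings, max_diff_deg=10.0):
    """Return the GIS function label of the building whose bearing best
    matches the facade box, or None if no building is within max_diff_deg.

    camera: dict with hypothetical keys "lat", "lon", "heading".
    buildings: list of dicts with hypothetical keys "lat", "lon", "function".
    """
    target = box_bearing_deg(box_x_center, image_width, camera["heading"])
    best, best_diff = None, max_diff_deg
    for b in buildings:
        to_building = bearing_deg(camera["lat"], camera["lon"], b["lat"], b["lon"])
        diff = angular_diff(target, to_building)
        if diff <= best_diff:
            best, best_diff = b, diff
    return None if best is None else best["function"]


# Made-up example data: a north-facing camera with one building to the
# north and one to the east.
camera = {"lat": 40.0, "lon": -74.0, "heading": 0.0}
buildings = [
    {"lat": 40.001, "lon": -74.0, "function": "residential"},
    {"lat": 40.0, "lon": -73.999, "function": "commercial"},
]
centre_label = coarse_label(camera, 1024, 2048, buildings)
right_label = coarse_label(camera, 1536, 2048, buildings)
left_label = coarse_label(camera, 512, 2048, buildings)
```

A real pipeline would more likely intersect viewing rays with building footprint polygons and handle occlusion. Matching on centroid bearing alone is a simplification chosen to keep the sketch short.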

| Search Query: ArXiv Query: search_query=au:"Xiang Zhang"&id_list=&start=0&max_results=3
