Similarity and Dissimilarity Guided Co-association Matrix Construction for Ensemble Clustering

Kavli Affiliate: Ran Wang

| First 5 Authors: Xu Zhang, Yuheng Jia, Mofei Song, Ran Wang

| Summary:

Ensemble clustering aggregates multiple weak clusterings to achieve a more
accurate and robust consensus result. The Co-Association matrix (CA matrix)
based method is the mainstream ensemble clustering approach that constructs the
similarity relationships between sample pairs according to the weak clustering
partitions to generate the final clustering result. However, existing
methods neglect that the quality of a cluster is related to its size, i.e., a
cluster with a smaller size tends to have higher accuracy. Moreover, they also do not
consider the valuable dissimilarity information in the base clusterings which
can reflect the varying importance of sample pairs that are completely
disconnected. To this end, we propose the Similarity and Dissimilarity Guided
Co-association matrix (SDGCA) to achieve ensemble clustering. First, we
introduce normalized ensemble entropy to estimate the quality of each cluster,
and construct a similarity matrix based on this estimation. Then, we employ the
random walk to explore high-order proximity of base clusterings to construct a
dissimilarity matrix. Finally, the adversarial relationship between the
similarity matrix and the dissimilarity matrix is utilized to construct a
promoted CA matrix for ensemble clustering. We compared our method with 13
state-of-the-art methods across 12 datasets, and the results demonstrated the
superior clustering ability and robustness of the proposed approach. The
code is available at https://github.com/xuz2019/SDGCA.
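
For readers unfamiliar with the co-association idea, the following is a minimal sketch of the classic, unweighted CA matrix that the summary takes as its starting point. It is not the SDGCA method: there is no entropy-based cluster weighting and no random-walk dissimilarity matrix. The function name `co_association` and the toy labels are assumptions made for illustration and are not taken from the authors' repository.

```python
import numpy as np


def co_association(base_labels):
    """Return the plain co-association matrix of a set of base clusterings.

    base_labels: array-like of shape (n_clusterings, n_samples); each row holds
    the cluster label of every sample in one base clustering.
    Entry (i, j) of the result is the fraction of base clusterings that place
    samples i and j in the same cluster.
    """
    base_labels = np.asarray(base_labels)
    n_clusterings, n_samples = base_labels.shape
    ca = np.zeros((n_samples, n_samples), dtype=float)
    for labels in base_labels:
        # Boolean matrix: True where two samples share a label in this clustering.
        ca += labels[:, None] == labels[None, :]
    return ca / n_clusterings


# Toy ensemble (hypothetical): three base clusterings of four samples.
base = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
ca = co_association(base)
print(ca)
```

In this toy ensemble, samples 0 and 3 never share a cluster, so their entry is exactly zero. The plain CA matrix treats all such completely disconnected pairs identically, which is the gap the summary says the dissimilarity matrix is meant to address.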

| Search Query: ArXiv Query: search_query=au:"Ran Wang"&id_list=&start=0&max_results=3
