
Deep Graph Clustering via Dual Correlation Reduction (2112.14772v1)

Published 29 Dec 2021 in cs.LG, cs.AI, and cs.CV

Abstract: Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years. However, we observe that, in the process of node encoding, existing methods suffer from representation collapse which tends to map all data into the same representation. Consequently, the discriminative capability of the node representation is limited, leading to unsatisfied clustering performance. To address this issue, we propose a novel self-supervised deep graph clustering method termed Dual Correlation Reduction Network (DCRN) by reducing information correlation in a dual manner. Specifically, in our method, we first design a siamese network to encode samples. Then by forcing the cross-view sample correlation matrix and cross-view feature correlation matrix to approximate two identity matrices, respectively, we reduce the information correlation in the dual-level, thus improving the discriminative capability of the resulting features. Moreover, in order to alleviate representation collapse caused by over-smoothing in GCN, we introduce a propagation regularization term to enable the network to gain long-distance information with the shallow network structure. Extensive experimental results on six benchmark datasets demonstrate the effectiveness of the proposed DCRN against the existing state-of-the-art methods.

Citations (173)

Summary

  • The paper introduces DCRN, a self-supervised network that reduces sample and feature correlations to counteract representation collapse in graph clustering.
  • It leverages a Siamese architecture enforcing identity matrices on cross-view correlation, enhancing discriminative feature learning.
  • Extensive experiments on benchmark datasets show that DCRN outperforms models like SDCN and DFCN in clustering accuracy and robustness.

Overview of "Deep Graph Clustering via Dual Correlation Reduction"

The paper "Deep Graph Clustering via Dual Correlation Reduction" introduces the Dual Correlation Reduction Network (DCRN) as a self-supervised method designed to enhance the performance of deep graph clustering. The authors identify a key issue in existing graph convolutional network (GCN)-based clustering methods known as representation collapse, where node encoding tends to converge to similar representations irrespective of their category distinctions. This diminishes the discriminative capability of node representations, ultimately hindering clustering performance.

To counteract this problem, DCRN reduces information correlation at both the sample and the feature level. A Siamese network encodes two augmented views of the graph, and the cross-view sample correlation matrix and the cross-view feature correlation matrix are each forced to approximate the identity matrix. Decorrelating representations in this dual manner enhances the discriminative capacity of the features derived from the network.
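The dual correlation reduction idea can be sketched as a loss over two normalized cross-view correlation matrices. This is a minimal illustrative version, not the paper's exact formulation: the function names and the mean-squared penalty toward the identity are assumptions for clarity, and the paper's actual loss uses its own normalization and weighting.

```python
import numpy as np

def cross_view_correlation(Za, Zb):
    """Correlation between L2-normalized rows of two view embeddings."""
    Za = Za / np.linalg.norm(Za, axis=1, keepdims=True)
    Zb = Zb / np.linalg.norm(Zb, axis=1, keepdims=True)
    return Za @ Zb.T

def dual_correlation_loss(Za, Zb):
    """Push sample-level and feature-level cross-view correlations toward identity.

    Za, Zb: (N, d) embeddings of the same N nodes from two views.
    Sample level normalizes rows (nodes); feature level normalizes
    columns (dimensions) by transposing first.
    """
    S = cross_view_correlation(Za, Zb)        # (N, N) sample correlations
    F = cross_view_correlation(Za.T, Zb.T)    # (d, d) feature correlations
    loss_sample = ((S - np.eye(S.shape[0])) ** 2).mean()
    loss_feature = ((F - np.eye(F.shape[0])) ** 2).mean()
    return loss_sample + loss_feature
```

When the two views agree and both samples and features are already decorrelated (e.g. orthonormal embeddings), the loss is zero; correlated, collapsed representations are penalized.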

In addition, DCRN addresses over-smoothing, a common barrier in GCNs, through a propagation regularization term. This term lets a shallow network acquire long-distance information, further improving clustering performance.
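One way to picture propagation regularization is as a penalty tying the encoder's output to its neighbor-propagated version, so that a shallow encoder already reflects the effect of deeper propagation. The sketch below uses a simple mean-squared gap with a symmetrically normalized adjacency; the paper's actual regularizer may use a different divergence, so treat the specific penalty here as an assumption.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def propagation_reg(Z, A_norm):
    """Penalize the gap between Z and its one-step propagated version A_norm @ Z.

    Minimizing this encourages representations consistent with further
    message passing, without stacking more (over-smoothing-prone) GCN layers.
    """
    return ((A_norm @ Z - Z) ** 2).mean()
```

On a regular graph with constant node features the propagated features are unchanged and the penalty vanishes; features that differ sharply from their neighborhood average incur a positive cost.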

Numerical Results and Implications

The authors provide extensive experimental results on six benchmark datasets—DBLP, CITE, ACM, AMAP, PUBMED, and CORAFULL—demonstrating the effectiveness of DCRN against state-of-the-art methods. Notably, DCRN consistently outperforms its counterparts such as SDCN, DFCN, and MVGRL across several metrics, including accuracy (ACC), normalized mutual information (NMI), adjusted Rand index (ARI), and macro F1-score (F1).
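Of these metrics, clustering accuracy is the one that needs care to compute, because cluster ids are arbitrary and must first be matched to ground-truth labels. A minimal stdlib-only sketch (brute-force over label permutations, adequate for the small cluster counts in these benchmarks; production code would typically use the Hungarian algorithm instead):

```python
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """ACC: fraction of correct assignments under the best one-to-one
    mapping between predicted cluster ids and ground-truth labels."""
    labels = sorted(set(y_true) | set(y_pred))
    best = 0
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))  # candidate cluster-id -> label map
        hits = sum(mapping[p] == t for t, p in zip(y_true, y_pred))
        best = max(best, hits)
    return best / len(y_true)
```

For example, a prediction that swaps the two cluster ids but partitions the nodes perfectly still scores an ACC of 1.0.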

The substantial empirical results suggest that the dual correlation reduction strategy is pivotal in disentangling redundant node representations, yielding the distinct features essential for effective clustering. Such enhancements have practical implications for social networks and recommendation systems, where accurate clustering improves user engagement and personalized experience.

Future Directions and Theoretical Implications

The mechanism devised in DCRN to prevent representation collapse may pave the way for future developments in unsupervised learning techniques, extending applicability across diverse domains reliant on clustering methods. The dual correlation reduction framework could potentially be adapted or integrated into other machine learning models requiring robust feature discrimination and efficient representation learning.

Theoretically, the framework sets a precedent for investigating correlation reduction methodologies beyond traditional self-supervised learning paradigms. Future research could explore hybrid approaches in which DCRN is complemented with external data sources or advanced learning paradigms to further improve representation fidelity.

In summary, the paper presents a sophisticated approach to reducing correlated redundancy within node representations, unlocking enhanced performance in graph clustering tasks. DCRN offers a practical solution to the prevalent issue of representation collapse in GCNs, thereby contributing significantly to the landscape of deep clustering methodologies.