Attributed Network Embedding for Learning in a Dynamic Environment (1706.01860v2)

Published 6 Jun 2017 in cs.SI, cs.LG, and stat.ML

Abstract: Network embedding leverages the node proximity manifested to learn a low-dimensional node vector representation for each node in the network. The learned embeddings could advance various learning tasks such as node classification, network clustering, and link prediction. Most, if not all, of the existing works, are overwhelmingly performed in the context of plain and static networks. Nonetheless, in reality, network structure often evolves over time with addition/deletion of links and nodes. Also, a vast majority of real-world networks are associated with a rich set of node attributes, and their attribute values are also naturally changing, with the emerging of new content patterns and the fading of old content patterns. These changing characteristics motivate us to seek an effective embedding representation to capture network and attribute evolving patterns, which is of fundamental importance for learning in a dynamic environment. To our best knowledge, we are the first to tackle this problem with the following two challenges: (1) the inherently correlated network and node attributes could be noisy and incomplete, it necessitates a robust consensus representation to capture their individual properties and correlations; (2) the embedding learning needs to be performed in an online fashion to adapt to the changes accordingly. In this paper, we tackle this problem by proposing a novel dynamic attributed network embedding framework - DANE. In particular, DANE first provides an offline method for a consensus embedding and then leverages matrix perturbation theory to maintain the freshness of the end embedding results in an online manner. We perform extensive experiments on both synthetic and real attributed networks to corroborate the effectiveness and efficiency of the proposed framework.

Authors (6)

Jundong Li (126 papers)
Harsh Dani (1 paper)
Xia Hu (186 papers)
Jiliang Tang (204 papers)
Yi Chang (150 papers)
Huan Liu (283 papers)

Citations (352)


Summary

Attributed Network Embedding for Learning in a Dynamic Environment: An Overview

This paper presents an approach to network embedding that addresses the dynamic nature of real-world attributed networks. The proposed framework, DANE (Dynamic Attributed Network Embedding), is designed to handle changes in both network structure and node attributes over time. It has two parts: an offline model that learns an initial consensus embedding, and an online model that uses matrix perturbation theory to keep that embedding up to date.

The primary contribution of this paper is its dynamic approach to network embedding. Unlike traditional methods that operate on static networks, DANE is built to adapt to continuous changes in both network topology and node attributes. This adaptability matters because real-world networks and their attributes rarely remain constant. In social networks, for example, user connections and interactions evolve frequently, so the network representation has to keep pace with these changes.

Offline Model

The offline component of DANE produces a consensus embedding that captures node proximity in both the network structure and the attribute space. Each of the two sources is first embedded separately with spectral embedding, formulated as a generalized eigenvalue problem, which also helps to reduce noise in the data. The embeddings of the network topology and the node attributes are then aligned by maximizing their correlation, yielding a single consensus representation. The method thus focuses on preserving proximity structure while limiting the noise that could otherwise distort the representation of the network's attributes and links.
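The offline stage described above can be sketched in a few lines of NumPy/SciPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `spectral_embedding` and `consensus_embedding` are hypothetical, the graph is assumed to be undirected with no isolated nodes, and the consensus step is one simple reading of "correlation maximization" (projecting the concatenated embeddings onto their leading principal directions). The paper's exact objective and constraints may differ.

```python
import numpy as np
from scipy.linalg import eigh


def spectral_embedding(W, k):
    """Solve L a = lambda D a and keep the k smallest non-trivial eigenpairs.

    W is a symmetric, non-negative affinity matrix (adjacency matrix or
    attribute similarity). Assumes every node has positive degree so that
    D is positive definite.
    """
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))  # degree matrix
    L = D - W                   # unnormalized graph Laplacian
    # eigh(L, D) solves the generalized symmetric-definite problem and
    # returns eigenvalues in ascending order, with D-orthonormal vectors.
    eigvals, eigvecs = eigh(L, D)
    # Skip the trivial pair (eigenvalue 0, constant eigenvector).
    return eigvals[1:k + 1], eigvecs[:, 1:k + 1]


def consensus_embedding(Y_a, Y_x, l):
    """Combine structure and attribute embeddings into l dimensions.

    Hypothetical simplification: project the concatenation [Y_a, Y_x]
    onto the top-l eigenvectors of its Gram matrix.
    """
    Y = np.hstack([Y_a, Y_x])
    gram = Y.T @ Y                         # small (2k x 2k) matrix
    _, vecs = np.linalg.eigh(gram)         # ascending eigenvalues
    P = vecs[:, ::-1][:, :l]               # top-l projection directions
    return Y @ P
```

Calling `spectral_embedding` once on the adjacency matrix and once on an attribute similarity matrix gives the two inputs of `consensus_embedding`. Because the Gram matrix is only 2k by 2k, the consensus step is cheap compared with the eigen-decomposition of the full graph.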

Online Model

The core of the proposed method is the online model, which efficiently updates the embeddings as the network and its attributes change. It uses matrix perturbation theory to update the eigenvalues and eigenvectors incrementally, which keeps the computational cost manageable compared with rerunning the offline embedding method at every time step. This component keeps the embedding fresh and accurate in dynamic environments, where timely adaptation to network changes is needed.
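To make the perturbation idea concrete, the sketch below applies the standard first-order perturbation formulas for the generalized eigenproblem L a = λ D a. It is an illustration under assumptions rather than the paper's code: `perturbation_update` is a hypothetical name, the eigenvectors are assumed to be D-orthonormal (as returned by `scipy.linalg.eigh(L, D)`), and pairs of nearly equal eigenvalues are simply skipped instead of being treated specially. If only the top-k eigenpairs are passed in, the sums run over those k pairs only, which is a further approximation.

```python
import numpy as np
from scipy.linalg import eigh


def perturbation_update(eigvals, eigvecs, delta_L, delta_D, tol=1e-12):
    """First-order update of eigenpairs of L a = lambda D a.

    eigvals: shape (k,), eigvecs: shape (n, k) with a_i^T D a_j = delta_ij.
    delta_L, delta_D: symmetric changes to the Laplacian and degree matrix.
    """
    # Project both perturbations onto the current eigenbasis:
    # proj[j, i] = a_j^T (delta) a_i
    proj_L = eigvecs.T @ delta_L @ eigvecs
    proj_D = eigvecs.T @ delta_D @ eigvecs

    # Eigenvalue update:
    # d_lambda_i = a_i^T dL a_i - lambda_i * a_i^T dD a_i
    new_vals = eigvals + np.diag(proj_L) - eigvals * np.diag(proj_D)

    # Eigenvector update: d_a_i = sum_j coeff[j, i] * a_j, where for j != i
    # coeff[j, i] = (a_j^T dL a_i - lambda_i a_j^T dD a_i) / (lambda_i - lambda_j)
    numer = proj_L - proj_D * eigvals[np.newaxis, :]
    gaps = eigvals[np.newaxis, :] - eigvals[:, np.newaxis]
    safe = np.abs(gaps) > tol  # skips the diagonal and near-degenerate pairs
    coeff = np.zeros_like(numer)
    coeff[safe] = numer[safe] / gaps[safe]
    # The diagonal term keeps the updated vectors normalized w.r.t. D + dD.
    np.fill_diagonal(coeff, -0.5 * np.diag(proj_D))

    new_vecs = eigvecs + eigvecs @ coeff
    return new_vals, new_vecs
```

The update costs a few matrix products instead of a fresh eigen-decomposition, which is where the efficiency gain over rerunning the offline model comes from. Since the approximation is first order, its error grows with the size of the change and shrinks with the gap between eigenvalues, so a periodic exact recomputation is a reasonable safeguard.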

Experimental Results

DANE was evaluated on multiple datasets, including synthetic attributed networks and real-world networks such as Epinions and DBLP. The results showed better clustering and classification performance than baseline methods such as DeepWalk, LINE, and others that lack dynamic adaptation. DANE also showed significant gains in time efficiency over these static approaches, which supports its applicability to large-scale and rapidly changing networks.

Implications and Future Directions

The implications of this work are notable for fields relying heavily on network analysis, including social network analysis, recommendation systems, and bioinformatics, among others. By supporting dynamic updates, DANE addresses a significant gap in current network embedding techniques, which typically overlook the evolving nature of real networks.

Future work might extend DANE's framework to multi-mode or multi-dimensional networks, or integrate deep learning techniques to better represent more complex relational networks. As networks continue to grow in size and complexity, such dynamic embedding methods will become increasingly important for keeping network analyses accurate and efficient.

In conclusion, DANE presents a significant step forward in dynamic network analytics, offering a robust, efficient solution to the evolving challenges of real-world networks. The integration of an online model for efficient updating positions it as a promising tool for continuous learning applications across a variety of domains.
