Network mutual information measures for graph similarity (2405.05177v2)

Published 8 May 2024 in physics.soc-ph and cs.SI

Abstract: A wide range of tasks in network analysis, such as clustering network populations or identifying anomalies in temporal graph streams, require a measure of the similarity between two graphs. To provide a meaningful data summary for downstream scientific analyses, the graph similarity measures used for these tasks must be principled, interpretable, and capable of distinguishing meaningful overlapping network structure from statistical noise at different scales of interest. Here we derive a family of graph mutual information measures that satisfy these criteria and are constructed using only fundamental information theoretic principles. Our measures capture the information shared among networks according to different encodings of their structural information, with our mesoscale mutual information measure allowing for network comparison under any specified network coarse-graining. We test our measures in a range of applications on real and synthetic network data, finding that they effectively highlight intuitive aspects of network similarity across scales in a variety of systems.


Summary

The paper introduces novel mutual information measures that evaluate graph similarity at micro, meso, and macro scales.
The methodology employs entropy calculations to quantify edge-level and degree-corrected neighborhood similarities, validated through simulations on synthetic networks.
Experiments highlight the robustness of MesoNMI in capturing community-level structures in real-world networks such as global trade and scientific collaborations.

Network Mutual Information Measures for Graph Similarity

The paper "Network Mutual Information Measures for Graph Similarity" by Helcio Felippe, Federico Battiston, and Alec Kirkley addresses a fundamental problem in network analysis: measuring the similarity between two graphs. The authors introduce a family of mutual information-based measures that evaluate graph similarity at multiple scales, providing enhanced interpretability and adaptability for various network analysis tasks.

Motivation and Challenges

Graph similarity measures are used across disciplines for tasks such as clustering network populations, detecting anomalies, and assessing structural resemblance between complex networks. Existing measures often struggle to separate meaningful shared structure from noise, particularly when the scale of interest varies: micro (individual nodes and edges), meso (community structure), or macro (global network properties). Traditional metrics built on feature embeddings or network spectra can be hard to interpret, and they do not consistently account for statistical significance or for structure at multiple scales.

Methodology

The authors propose mutual information measures based on fundamental information-theoretic principles. These measures evaluate the information shared between two graphs under different encoding schemes:

Network Mutual Information (NMI): Focuses on edge-level similarity.
Degree-Corrected Network Mutual Information (DC-NMI): Incorporates node degree distributions to assess neighborhood similarity.
Mesoscale Mutual Information (MesoNMI): Facilitates comparison at coarser scales, capturing community or group-level similarities in networks using a coarse-graining partition approach.

Each measure is derived by computing the entropy of one graph under a given encoding and then quantifying how much that entropy is reduced once the other graph is known. NMI and DC-NMI operate at fine structural resolution, evaluating exact edge overlaps and degree-consistent neighborhood overlaps, respectively. MesoNMI operates at a coarser resolution, comparing the connectivity within and between predefined groups or communities rather than individual edges.
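The entropy-reduction idea behind the edge-level measure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes simple undirected graphs on the same labeled node set, encodes a graph by choosing its E edges uniformly among all node pairs, and normalizes by the mean of the two graph entropies. The function names (log_binom, network_nmi) and the normalization choice are hypothetical and may differ from the definitions in the paper.

```python
from math import lgamma


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    if k < 0 or k > n:
        raise ValueError("need 0 <= k <= n")
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def network_nmi(edges1, edges2, n_nodes):
    """Edge-level normalized mutual information (illustrative sketch).

    Assumptions (not taken from the paper's code): simple undirected graphs
    without self-loops on the same n_nodes labeled nodes, and normalization
    by the mean of the two entropies.
    """
    g1 = {frozenset(e) for e in edges1}
    g2 = {frozenset(e) for e in edges2}
    pairs = n_nodes * (n_nodes - 1) // 2           # possible undirected edges
    e1, e2, e12 = len(g1), len(g2), len(g1 & g2)   # edge counts and overlap

    h1 = log_binom(pairs, e1)                      # entropy of G1 alone
    h2 = log_binom(pairs, e2)                      # entropy of G2 alone
    # Entropy of G2 once G1 is known: choose which of G1's edges are shared,
    # then place the remaining edges of G2 among the pairs G1 leaves empty.
    h2_given_1 = log_binom(e1, e12) + log_binom(pairs - e1, e2 - e12)

    mutual_info = h2 - h2_given_1
    norm = 0.5 * (h1 + h2)
    return mutual_info / norm if norm > 0 else 0.0


# Usage: two small graphs on 6 nodes that share two of their three edges.
g_a = [(0, 1), (1, 2), (2, 3)]
g_b = [(0, 1), (1, 2), (4, 5)]
print(network_nmi(g_a, g_b, 6))
```

Under this encoding the mutual information is symmetric in the two graphs, and two identical graphs score exactly 1 because knowing one leaves no uncertainty about the other.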

Findings and Experiments

The researchers ran extensive simulations on synthetic networks, including graphs generated by the Erdős-Rényi, Barabási-Albert, and stochastic block models, to validate the measures. Structural perturbations were simulated by rewiring nodes and edges, which shows how sensitive each measure is to different kinds of disruption. Notably, MesoNMI remained robust to such perturbations when operating at larger scales, indicating that it can detect mesoscale similarity despite micro-level disruption.
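A perturbation experiment of this kind can be sketched as follows. The snippet is an assumption-laden illustration rather than the paper's protocol: it covers only edge rewiring on an Erdős-Rényi graph, moves a chosen fraction of edges to node pairs that were previously unconnected, and reports the surviving edge overlap instead of any of the mutual information measures. The helper names (erdos_renyi, rewire) are hypothetical.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, rng):
    """Sample a simple undirected G(n, p) graph as a set of frozenset edges."""
    return {frozenset(pair) for pair in combinations(range(n), 2)
            if rng.random() < p}


def rewire(edges, n, fraction, rng):
    """Move a fraction of edges to node pairs that were empty in the original.

    The edge count is preserved, so only the placement of edges changes.
    """
    edges = set(edges)
    empty = [frozenset(pair) for pair in combinations(range(n), 2)
             if frozenset(pair) not in edges]
    n_move = min(round(fraction * len(edges)), len(empty))
    # Sort before sampling so the result depends only on the seed.
    removed = rng.sample(sorted(edges, key=sorted), n_move)
    added = rng.sample(empty, n_move)
    return (edges - set(removed)) | set(added)


# Usage: perturb a random graph at increasing noise levels.
rng = random.Random(0)
original = erdos_renyi(30, 0.1, rng)
for fraction in (0.0, 0.25, 0.5, 1.0):
    perturbed = rewire(original, 30, fraction, rng)
    shared = len(original & perturbed)
    print(f"fraction={fraction:.2f}  shared edges={shared}/{len(original)}")
```

Feeding each original and perturbed pair into a similarity measure then traces how quickly that measure decays as the rewired fraction grows.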

The measures were also applied to empirical datasets, including global trade networks and scientific collaboration networks. In the trade networks, with countries grouped by geopolitical and economic indicators, MesoNMI captured similarities that align with known trade patterns, illustrating how the measure adapts to real-world networks with multiscale structure.
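The coarse-graining step that a mesoscale comparison relies on can be illustrated with a short Python sketch. It only aggregates edges into counts within and between groups for a user-supplied partition; the entropy calculation that MesoNMI performs on such a summary is not reproduced here. The function name coarse_grain and the toy partition are hypothetical.

```python
from collections import Counter


def coarse_grain(edges, group_of):
    """Count edges within and between groups for a given node partition.

    edges:    iterable of (u, v) node pairs from an undirected graph
    group_of: dict mapping each node to its group label (labels must be
              mutually comparable, e.g. all strings)
    Returns a Counter keyed by unordered group pairs (a, b) with a <= b.
    """
    counts = Counter()
    for u, v in edges:
        a, b = sorted((group_of[u], group_of[v]))
        counts[(a, b)] += 1
    return counts


# Usage: two graphs that differ at the edge level but agree at the group level.
groups = {0: "A", 1: "A", 2: "B", 3: "B"}
g1 = [(0, 1), (2, 3), (1, 2)]
g2 = [(0, 1), (2, 3), (0, 3)]
print(coarse_grain(g1, groups))
print(coarse_grain(g2, groups))
```

The two toy graphs have different edge sets yet identical group-level counts, which is the kind of similarity an edge-level measure misses and a mesoscale measure is designed to register.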

Implications and Future Directions

This paper offers a principled framework for graph similarity using mutual information, emphasizing the versatility and interpretability necessary for diverse applications in network science. The proposed measures provide a toolkit for researchers to select or construct appropriate measures that consider the scale and type of similarity pertinent to their specific context.

Future studies could extend these measures to directed graphs explicitly, incorporate multigraph considerations directly within standard NMI frameworks, or adapt them to hypergraphs and other complex data structures that encapsulate higher-order interactions. Additionally, exploring dynamic network settings or real-time graph similarity assessments presents an intriguing avenue for further research.

In conclusion, the paper contributes to the graph similarity literature a family of interpretable, information-theoretic measures that compare networks at the edge, neighborhood, and community scales, addressing the difficulty existing methods have in separating shared structure from noise.


GitHub

hfelippe/network-MI
