- The paper introduces a sample path method to identify the origin of infections in SIR networks.
- It employs a reverse infection algorithm using infection eccentricity and Jordan center concepts for source estimation.
- Extensive simulations on tree structures and real-world networks show the approach outperforms traditional centrality measures.
Information Source Detection in the SIR Model: A Sample Path Based Approach
The paper "Information Source Detection in the SIR Model: A Sample Path Based Approach," authored by Kai Zhu and Lei Ying, addresses the problem of identifying the information source within networks modeled using the Susceptible-Infected-Recovered (SIR) paradigm. This paper develops a novel methodology to detect the origin of an information spread, which is particularly relevant in contexts such as tracing the first computer infected in a virus outbreak or identifying the initiator of rumors within social networks.
The research assumes a network where all nodes start in a susceptible state except for one node, termed the information source, which is initially infected. As the process unfolds, infected nodes can transmit the infection to susceptible neighbors and may recover afterward, rendering them immune. A fundamental challenge is that the observed snapshot reveals only which nodes are infected; susceptible and recovered nodes are indistinguishable to the observer. A direct maximum likelihood estimation (MLE) approach would require evaluating all possible infection propagation paths, which is computationally prohibitive for large networks.
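To make the setting concrete, here is a minimal sketch of a discrete-time SIR spread on a small graph, assuming a per-edge infection probability `q` and a per-node recovery probability `p` at each step; the parameter values and function names are illustrative rather than taken from the paper.

```python
import random

def simulate_sir(adj, source, q=0.5, p=0.3, steps=5, seed=0):
    """Spread an SIR process from `source` and return the observed infected set."""
    rng = random.Random(seed)
    state = {v: "S" for v in adj}           # S = susceptible, I = infected, R = recovered
    state[source] = "I"
    for _ in range(steps):
        newly_infected, newly_recovered = [], []
        for v, s in state.items():
            if s != "I":
                continue
            for u in adj[v]:                 # infected node tries to infect susceptible neighbors
                if state[u] == "S" and rng.random() < q:
                    newly_infected.append(u)
            if rng.random() < p:             # infected node may recover and become immune
                newly_recovered.append(v)
        for u in newly_infected:
            state[u] = "I"
        for v in newly_recovered:
            state[v] = "R"
    # The observer only sees which nodes are infected; S and R look identical.
    return {v for v, s in state.items() if s == "I"}

# Example: a small path graph 0-1-2-3-4 with source node 2.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(simulate_sir(adj, source=2))
```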
The authors introduce a sample path based methodology as a tractable alternative. Instead of maximizing the likelihood over all possible spreading histories, which becomes intractable when infection times are unknown, the method finds the sample path most likely to have produced the observed infection snapshot and declares the root of that path to be the information source. Infection eccentricity and Jordan centers play central roles in this formulation: on tree networks, the root of the optimal sample path is shown to be a node with minimum infection eccentricity, that is, a Jordan infection center. The reverse infection algorithm approximates this solution by having each observed infected node broadcast a message through the network; the nodes that first collect messages from all infected nodes have minimum infection eccentricity and are selected as the Jordan infection center estimates.
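The sketch below computes the same quantity by brute force: for every candidate node it evaluates the infection eccentricity (the maximum shortest-path distance to any observed infected node) and returns a minimizer, i.e., a Jordan infection center. This BFS version mirrors what the message-passing reverse infection algorithm converges to; the helper names are illustrative assumptions, not the paper's code.

```python
from collections import deque

def bfs_distances(adj, src):
    """Hop distances from `src` to all reachable nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def jordan_infection_center(adj, infected):
    """Return a node with minimum infection eccentricity w.r.t. the infected set."""
    best_node, best_ecc = None, float("inf")
    for v in adj:
        dist = bfs_distances(adj, v)
        # Infection eccentricity of v: farthest observed infected node.
        ecc = max(dist.get(u, float("inf")) for u in infected)
        if ecc < best_ecc:
            best_node, best_ecc = v, ecc
    return best_node, best_ecc

# Reusing the path graph from above: if nodes {1, 2, 3} are observed infected,
# node 2 minimizes the infection eccentricity.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(jordan_infection_center(adj, infected={1, 2, 3}))
```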
Strong theoretical guarantees support this approach for tree-structured graphs. In g-regular trees, the derived estimator lies, with high probability, within a constant distance of the actual source, a bound that is independent of both the snapshot time and the number of infected nodes. The algorithm was also evaluated numerically against the traditional closeness centrality heuristic, and it consistently detected the information source more accurately, making it more reliable for practical deployment over complex network topologies.
A series of simulations conducted over various network types, including binomial random trees and real-world networks such as the Internet Autonomous Systems graph and Wikipedia's who-votes-on-whom network, affirms the robustness of this approach. The results show a clear improvement over baseline and competing heuristics: the reverse infection algorithm achieves higher detection rates than random guessing and centrality-based alternatives, and its estimate typically lies fewer hops from the actual source node.
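A simple way to reproduce this kind of comparison is an evaluation loop that repeatedly spreads an SIR process from a random source, estimates the source from the infected snapshot, and records the hop distance between estimate and truth. The sketch below reuses the `simulate_sir` and `jordan_infection_center` sketches above; it is an illustrative setup with assumed parameters, not the paper's experimental protocol.

```python
import random

def evaluate(adj, trials=100, seed=1):
    """Average hop distance between the estimated and the true source."""
    rng = random.Random(seed)
    nodes = list(adj)
    errors = []
    for t in range(trials):
        source = rng.choice(nodes)
        infected = simulate_sir(adj, source, seed=t)
        if not infected:
            continue                         # spread died out before the snapshot
        estimate, _ = jordan_infection_center(adj, infected)
        errors.append(bfs_distances(adj, source).get(estimate, float("inf")))
    return sum(errors) / len(errors) if errors else float("nan")

print(evaluate(adj))
```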
From a theoretical perspective, the work advances the understanding of information diffusion processes, especially under the added complexity of recovery in the SIR model. Practically, the ability to locate the information source can drive improvements in areas such as cybersecurity (tracing malware origins) and misinformation management on digital platforms.
The paper concludes by discussing opportunities for algorithmic optimization and adaptation to more varied network configurations. This opens the way for further research into adaptive mechanisms and possible integration with real-time data analytics streams, which could substantially enhance predictive monitoring and control of information flows.