A Survey on Influence Maximization in a Social Network (1808.05502v1)

Published 16 Aug 2018 in cs.SI

Abstract: Given a social network with diffusion probabilities as edge weights and an integer k, which k nodes should be chosen for initial injection of information to maximize influence in the network? This problem is known as Target Set Selection in a social network (TSS Problem) and, more popularly, as the Social Influence Maximization Problem (SIM Problem). It has been an active area of research in computational social network analysis for roughly a decade and a half. Due to its practical importance in domains such as viral marketing, targeted advertising, and personalized recommendation, the problem has been studied in many variants, and different solution methodologies have been proposed over the years. Hence, there is a need for an organized and comprehensive review of this topic. This paper presents a survey of the progress in and around the TSS Problem. Finally, it discusses current research trends and future research directions.

Summary

The paper provides a comprehensive survey of the influence maximization problem in social networks, detailing its variants, computational complexity, and solution methodologies.
The problem is shown to be NP-Hard under common diffusion models like IC and LT, necessitating the use of approximate or heuristic algorithms.
Solution approaches range from approximation algorithms offering guarantees to scalable heuristics and sampling-based methods addressing computational challenges in large networks.

Insights on Influence Maximization in Social Networks

This survey paper by Suman Banerjee et al. provides a comprehensive review of the Social Influence Maximization (SIM) problem, a central challenge in computational social network analysis. The authors focus on the Target Set Selection (TSS) problem, which asks which set of nodes should receive a piece of information initially so that its eventual spread through the network is maximized. The topic has attracted substantial attention because it applies to domains such as viral marketing, personalized recommendation, and targeted advertising.

Overview of Variants and Computational Complexity

The paper systematically explores formulations of the SIM problem beyond the basic model. It presents a taxonomy that includes the Top k-node problem, Influence Spectrum, λ Coverage, Weighted TSS, r-round min-TSS, Budgeted Influence Maximization, and further settings that vary parameters such as the number of diffusion rounds and the influential node sets considered. These variants arise from practical considerations such as budget constraints, the time available for influence to spread, and the specificity of the target audience.

On the complexity front, the SIM problem is shown to be NP-Hard under two common probabilistic diffusion models, Independent Cascade (IC) and Linear Threshold (LT), so exact solutions are not expected to be computable in polynomial time. The paper also reports inapproximability results for the problem, which motivates the use of approximation or heuristic strategies in practice. The authors extend the complexity analysis to parameterized settings, showing how structural parameters of the network affect tractability and the choice of solution approach.
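
To make the IC model concrete, here is a minimal Python sketch of how expected spread is commonly estimated by Monte Carlo simulation. It is an illustration under stated assumptions, not code from the survey: the graph representation (a dict of `(neighbor, probability)` lists) and the names `simulate_ic`, `estimate_spread`, and `toy_graph` are hypothetical.

```python
import random
from collections import deque


def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade and return the set of activated nodes.

    graph: dict mapping node -> list of (neighbor, probability) pairs
           (an assumed representation, chosen for this sketch).
    """
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        u = frontier.popleft()
        for v, p in graph.get(u, []):
            # A newly activated node u gets a single chance to activate
            # each inactive out-neighbor v, succeeding with probability p.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Average the number of activated nodes over repeated cascades."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs))
    return total / runs


# Toy graph with probabilities 0.0 and 1.0 so the outcome is deterministic.
toy_graph = {
    "a": [("b", 1.0), ("c", 0.0)],
    "b": [("d", 1.0)],
    "c": [("e", 1.0)],
    "d": [],
    "e": [],
}

if __name__ == "__main__":
    print(estimate_spread(toy_graph, ["a"], runs=100))
```

On real networks the edge probabilities lie strictly between 0 and 1, so the estimate is noisy and its accuracy depends on the number of runs. That per-evaluation cost is a large part of why scalable methods matter.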

Solution Methodologies

The survey categorizes solution methodologies into heuristic approaches, approximation algorithms with provable guarantees, metaheuristic solutions, community-based methods, and miscellaneous strategies. Kempe et al.'s foundational greedy algorithm offers a provable approximation guarantee (a factor of 1 - 1/e, up to sampling error, which follows from the submodularity of the influence function), but it is computationally intensive. This cost has driven subsequent research toward better scalability through techniques such as CELF, CELF++, and Static Greedy, and more recently through reverse reachable (RR) set sampling, as in the TIM, IMM, and Stop-and-Stare algorithms.
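
The lazy-evaluation idea behind CELF can be sketched as follows. This is a simplified illustration, not the published implementation: it assumes the IC model, a graph given as a dict in which every node is a key, and nodes that can be compared for tie-breaking. The names `estimate_spread`, `celf_greedy`, and `toy_graph` are hypothetical.

```python
import heapq
import random


def estimate_spread(graph, seeds, runs, rng):
    """Monte Carlo estimate of the expected spread under the IC model."""
    total = 0
    for _ in range(runs):
        active = set(seeds)
        stack = list(seeds)
        while stack:
            u = stack.pop()
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    stack.append(v)
        total += len(active)
    return total / runs


def celf_greedy(graph, k, runs=200, seed=0):
    """Greedy seed selection with CELF-style lazy re-evaluation."""
    rng = random.Random(seed)
    # Heap entries: (-marginal_gain, node, seed-set size when gain was computed).
    heap = [(-estimate_spread(graph, [v], runs, rng), v, 0) for v in sorted(graph)]
    heapq.heapify(heap)
    seeds, spread = [], 0.0
    while heap and len(seeds) < k:
        neg_gain, v, computed_at = heapq.heappop(heap)
        if computed_at == len(seeds):
            # The gain is up to date for the current seed set, so v is the
            # greedy choice (up to Monte Carlo noise).
            seeds.append(v)
            spread += -neg_gain
        else:
            # Stale gain: submodularity says it can only have shrunk, so
            # recompute it for this node alone and push it back.
            gain = estimate_spread(graph, seeds + [v], runs, rng) - spread
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, spread


# Deterministic toy graph (all probabilities are 1.0).
toy_graph = {
    "a": [("b", 1.0), ("c", 1.0)],
    "b": [],
    "c": [],
    "d": [("e", 1.0)],
    "e": [],
    "f": [],
}

if __name__ == "__main__":
    print(celf_greedy(toy_graph, 2, runs=20))
```

The saving comes from the `else` branch: plain greedy re-estimates the marginal gain of every remaining node in every round, whereas the lazy version re-estimates only the nodes that reach the top of the heap.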

Heuristic solutions such as Degree Discount and PageRank exploit topological features of the network for efficient seed selection but generally lack formal guarantees on influence spread. The paper also examines metaheuristic approaches, including genetic algorithms and discrete particle swarm optimization, which search the space of seed sets iteratively and adaptively. Community-based approaches, in turn, use the network's community structure to localize seed selection, which helps with scalability and can improve the efficiency of influence spread.
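
As an illustration of a topology-based heuristic, the sketch below follows the commonly cited Degree Discount rule, in which a node v with degree d_v and t_v already-seeded neighbors is scored as d_v - 2 t_v - (d_v - t_v) t_v p. It assumes an undirected graph, a uniform propagation probability p, and an adjacency dict in which every node is a key. The names `degree_discount` and `toy_adj` are hypothetical.

```python
def degree_discount(adj, k, p=0.01):
    """Pick k seeds with the Degree Discount heuristic.

    adj: dict mapping node -> set of neighbors (undirected graph).
    p:   uniform propagation probability assumed by the heuristic.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    discounted = dict(degree)         # dd_v, initially the plain degree
    seed_nbrs = {v: 0 for v in adj}   # t_v, neighbors already chosen as seeds
    seeds, chosen = [], set()
    for _ in range(min(k, len(adj))):
        # Sorting makes tie-breaking deterministic; max keeps the first maximum.
        candidates = sorted(v for v in adj if v not in chosen)
        u = max(candidates, key=lambda v: discounted[v])
        seeds.append(u)
        chosen.add(u)
        for v in adj[u]:
            if v in chosen:
                continue
            # Discount v because one more of its neighbors is now a seed.
            seed_nbrs[v] += 1
            d, t = degree[v], seed_nbrs[v]
            discounted[v] = d - 2 * t - (d - t) * t * p
    return seeds


# Toy graph: "b" has the second-highest degree but sits next to "a".
toy_adj = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
    "e": {"a"},
    "f": {"g", "h"},
    "g": {"f"},
    "h": {"f"},
}

if __name__ == "__main__":
    print(degree_discount(toy_adj, 2, p=0.1))
```

In this toy graph a plain highest-degree rule would pick "a" and then "b", whose neighborhoods overlap heavily. The discount lowers the score of "b" once "a" is chosen, so the second seed is "f" instead.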

Future Directions and Theoretical Implications

The paper identifies several avenues for future research, particularly emphasizing the exploration of dynamic and topic-aware social networks, where temporal and semantic facets of information dissemination necessitate novel algorithmic strategies. Incorporating parameters like cost, benefit, and influence timeframe offers potential for more realistic modeling, especially in economic and strategic applications like viral marketing.

On the practical side, improving algorithm scalability remains a primary objective, and parallel processing and distributed computing frameworks are promising ways to cope with the growing size and complexity of social network data. The authors also advocate empirical benchmarking to clarify how well algorithms actually perform, a critical step in reconciling theoretical findings with real-world performance.

The paper provides a reference point for understanding the current state of the SIM problem and its likely trajectory, with scalability as an ongoing research thrust. Through its systematic survey and critique, it serves as a useful guide for researchers and practitioners who want to apply influence maximization in social network analysis.