Robust Optimization in Influence Maximization

Updated 28 February 2026

Robust influence maximization is a framework that selects seed sets to maximize worst-case influence across a set of uncertain network models.
It replaces traditional expectation objectives with min-max or min-ratio formulations to hedge against unknown diffusion parameters, adversarial correlations, and evolving network topologies.
Algorithmic advances including bicriteria approaches, scenario sampling, and MWU techniques provide practical approximations despite the NP-hardness inherent in worst-case settings.

Robust optimization and influence maximization intersect in the problem of selecting seed sets in networks so as to guarantee influence spread under various forms of uncertainty, such as unknown diffusion parameters, adversarial correlations, or evolving models. The robust optimization framework replaces the canonical expected-spread objective of classic influence maximization with a min-max or min-ratio objective that hedges against worst-case network, parameter, or model instantiations. This shift has driven substantial theoretical and algorithmic innovations, including both hardness results and a variety of tractable approximation schemes under specific structural assumptions.

1. Robust Influence Maximization: Formalizations and Objective

Robust influence maximization (RIM) generalizes classical influence maximization to settings where there is uncertainty about the influence function, transmission probabilities, or network topology. Let $G=(V,E)$ be a directed network, and consider a collection $\mathcal{F} = \{f_1,\dots,f_m\}$ of monotone, submodular influence functions, representing alternatives due to different topics, observed contexts, or diffusion models. The robust objective seeks a seed set $S\subseteq V$ of size at most $k$ maximizing the worst-case normalized influence:

$R(S) = \min_{i\in[m]} \frac{f_i(S)}{OPT_i}$

where $OPT_i = \max_{|T|\leq k} f_i(T)$ is the scenario-specific optimum. This model captures adversarial choice: after $S$ is selected, the influence function that is worst for $S$ is chosen (He et al., 2016).
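The definition of $R(S)$ can be made concrete on a toy instance. The sketch below is illustrative only: the four-node instance, the use of coverage functions as stand-ins for influence functions, and the brute-force computation of $OPT_i$ are all assumptions chosen so that the example stays exact and small, not part of the cited work.

```python
from itertools import combinations

# Hypothetical toy instance: in each scenario, a node "reaches" a fixed set of nodes.
# Coverage functions f_i(S) = |union of reach_i[v] for v in S| are monotone submodular.
NODES = ["a", "b", "c", "d"]
SCENARIOS = [
    {"a": {"a", "b", "c"}, "b": {"b"}, "c": {"c", "d"}, "d": {"d"}},
    {"a": {"a"}, "b": {"b", "c", "d"}, "c": {"c"}, "d": {"d", "a"}},
]

def spread(reach, seeds):
    """f_i(S): number of nodes covered by the seed set in one scenario."""
    covered = set()
    for v in seeds:
        covered |= reach[v]
    return len(covered)

def scenario_opt(reach, k):
    """OPT_i by brute force; sets of size exactly k suffice by monotonicity."""
    return max(spread(reach, T) for T in combinations(NODES, k))

def robust_ratio(seeds, k):
    """R(S) = min_i f_i(S) / OPT_i."""
    return min(spread(r, seeds) / scenario_opt(r, k) for r in SCENARIOS)

def best_robust_set(k):
    """Exhaustive maximizer of R(S); feasible only for tiny instances."""
    return max(combinations(NODES, k), key=lambda S: robust_ratio(S, k))
```

In this instance the set {a, c} is optimal for the first scenario but attains only half of the optimum in the second, while {a, b} attains at least three quarters of the optimum in both, which is the kind of trade-off the robust objective is designed to expose.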

Alternatively, under parameter uncertainty, with influence probabilities $p_e$ unknown but constrained to intervals $[l_e, r_e]$, the robust ratio is

$g(\Theta, S) = \min_{\theta \in \Theta} \frac{\sigma_\theta(S)}{\sigma_\theta(S_\theta^*)}$

where $\Theta = \prod_e [l_e, r_e]$ and $S_\theta^*$ is the optimal seed set for fixed $\theta$ (Chen et al., 2016).

Hyperparametric robust IM models the edge probabilities as functions $p_e(\theta) = H(\theta, x_e)$ of low-dimensional hyperparameters $\theta$ and edge features, leading to objectives of the form $\max_{|S|\leq k} \min_{\theta \in \Theta} f_\theta(S)$ (Kalimeris et al., 2019, Saha et al., 2024).
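As a rough illustration of the hyperparametric model, the sketch below maps a hyperparameter vector to a full set of edge probabilities. The logistic form of $H$, the two-dimensional edge features, and the names `edge_probability` and `instantiate` are assumptions made for this example; the text above only requires that $p_e$ be some function of $\theta$ and $x_e$.

```python
import math

def edge_probability(theta, x_e):
    """Hypothetical choice of H: logistic function of the inner product <theta, x_e>."""
    z = sum(t * x for t, x in zip(theta, x_e))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 2-dimensional features for a three-edge graph.
EDGE_FEATURES = {
    ("u", "v"): (1.0, 0.0),
    ("v", "w"): (0.0, 1.0),
    ("u", "w"): (1.0, 1.0),
}

def instantiate(theta):
    """Map one hyperparameter vector to probabilities for every edge."""
    return {e: edge_probability(theta, x) for e, x in EDGE_FEATURES.items()}

probs = instantiate((0.0, -2.0))
```

The point of the parameterization is that the adversary chooses only the low-dimensional $\theta$, so all edge probabilities move together rather than independently.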

Distributionally robust formulations, such as correlation-robust IM, maximize minimum expected spread over all joint distributions on edge live/dead states consistent with prescribed marginals, capturing model uncertainty in correlations rather than just parameters (Chen et al., 2020).

2. Complexity and Hardness: Theoretical Limits

Robust influence maximization is frequently intractable. Even for a single influence function (the classic problem), maximizing expected spread is NP-hard. For $m>1$ functions, $R(S)$ is a minimum over $m$ monotone submodular functions and is in general no longer submodular. For any constants $\delta, \varepsilon > 0$, it is NP-hard to find a seed set of size at most $(1-\delta) k \ln m$ with $R(S) \geq 1/n^{1 - \varepsilon}$ (He et al., 2016). This bicriteria hardness result is based on a reduction from gap-Set-Cover and shows that, unless P = NP, no polynomial-time algorithm can obtain nontrivial robust guarantees without a logarithmic-factor increase in seed set cardinality.
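The loss of submodularity under a pointwise minimum can be checked on a two-node example. The functions `f1` and `f2` below are hypothetical indicator functions, chosen only because they are trivially monotone and submodular.

```python
def f1(S):
    """Monotone submodular (in fact modular): 1 if node 'a' is seeded."""
    return 1 if "a" in S else 0

def f2(S):
    """Monotone submodular (in fact modular): 1 if node 'b' is seeded."""
    return 1 if "b" in S else 0

def g(S):
    """Pointwise minimum of the two scenario functions."""
    return min(f1(S), f2(S))

# Marginal gain of adding 'b' to the empty set versus to the larger set {'a'}.
gain_at_empty = g({"b"}) - g(set())
gain_at_a = g({"a", "b"}) - g({"a"})
```

Submodularity would require the gain at the smaller set to be at least the gain at the larger set, but here adding `b` is worth nothing on its own and worth 1 once `a` is present, so `g` violates diminishing returns.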

Analogous intractability is present for interval-uncertainty models. For any $\varepsilon>0$, RIM is NP-hard to approximate within $1-1/e+\varepsilon$ (Chen et al., 2016). There exist instances where the best achievable robust ratio degrades to $O(k/n)$, and randomization improves this only to $O((\log n)/\sqrt{n})$ at best.

Under hyperparametric uncertainty, the robust max-min objective is again NP-hard to approximate within any factor better than $O(1/n^{1-\varepsilon})$ using only $k$ seeds, unless a substantial bicriteria relaxation is permitted (Kalimeris et al., 2019). Correlation-robust IM is shown to be NP-hard by reduction from Max-Cover (Chen et al., 2020).

3. Approximation Algorithms and Sampling Schemes

Despite worst-case intractability, algorithmic progress is achieved via bicriteria and scenario-aggregation approaches.

Bicriteria Algorithms

When $O(k \log m)$ seeds are allowed, a "Saturate Greedy" algorithm recovers a $(1-1/e)$-approximation to the robust objective $R(S)$. For a candidate value $c$, define $h^{(c)}_i(S) = \min(c, f_i(S)/OPT_i)$ and the aggregate $H^{(c)}(S) = \sum_i h^{(c)}_i(S)$. Truncating a monotone submodular function at a constant preserves submodularity, so $H^{(c)}$ is monotone submodular, and $H^{(c)}(S) = mc$ holds exactly when every scenario attains ratio at least $c$. A greedy submodular cover algorithm identifies $S$ of size $O(k \log m)$ achieving $H^{(c)}(S)\geq m c - \varepsilon$, enabling a binary search for the largest feasible $c$ (He et al., 2016).
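The truncate-and-cover idea can be sketched in a few lines. This is a simplified illustration rather than the algorithm of He et al.: the toy coverage scenarios, the brute-force values of $OPT_i$, the specific budget $\lceil k(1+\ln m)\rceil$, and the function names are all assumptions made so the example is exact and self-contained.

```python
import math
from itertools import combinations

# Hypothetical toy instance with two coverage scenarios.
NODES = ["a", "b", "c", "d"]
SCENARIOS = [
    {"a": {"a", "b", "c"}, "b": {"b"}, "c": {"c", "d"}, "d": {"d"}},
    {"a": {"a"}, "b": {"b", "c", "d"}, "c": {"c"}, "d": {"d", "a"}},
]

def spread(reach, seeds):
    covered = set()
    for v in seeds:
        covered |= reach[v]
    return len(covered)

def scenario_opts(k):
    """OPT_i by brute force (assumption: only viable on tiny instances)."""
    return [max(spread(r, T) for T in combinations(NODES, k)) for r in SCENARIOS]

def ratios(seeds, opts):
    return [spread(r, seeds) / o for r, o in zip(SCENARIOS, opts)]

def greedy_cover(c, opts, budget, eps=1e-9):
    """Grow S greedily until H^(c)(S) >= m*c - eps or the budget is exhausted."""
    m = len(SCENARIOS)
    def H(S):
        return sum(min(c, x) for x in ratios(S, opts))
    S = []
    while H(S) < m * c - eps and len(S) < budget:
        rest = [v for v in NODES if v not in S]
        S.append(max(rest, key=lambda v: H(S + [v])))
    return S if H(S) >= m * c - eps else None

def saturate_greedy(k, iters=30):
    """Binary search over c, keeping the last seed set that saturates every scenario."""
    opts = scenario_opts(k)
    # Hypothetical concrete choice of an O(k log m) budget, capped at |V|.
    budget = min(len(NODES), math.ceil(k * (1 + math.log(len(SCENARIOS)))))
    lo, hi, best = 0.0, 1.0, []
    for _ in range(iters):
        c = (lo + hi) / 2
        S = greedy_cover(c, opts, budget)
        if S is not None:
            lo, best = c, S
        else:
            hi = c
    return best, lo
```

On this instance no two-seed set exceeds a robust ratio of 0.75, whereas the relaxed budget lets the search saturate both scenarios, which mirrors the bicriteria trade of extra seeds for a stronger worst-case guarantee.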

Under parameter-interval uncertainty, a solution-dependent lower-upper greedy algorithm selects seeds using the upper and lower bounds on $p_e$, yielding a worst-case ratio of at least $\alpha(\Theta)(1-1/e)$, where $\alpha(\Theta)$ is the ratio of spreads achieved under extreme parameter settings (Chen et al., 2016).
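One plausible reading of the lower-upper greedy scheme is sketched below: run greedy under the lower and the upper interval endpoints, keep whichever seed set spreads further under the lower endpoints, and report the resulting spread ratio as $\alpha(\Theta)$. The five-node graph, the interval values, the exact spread computation by enumerating live-edge graphs, and this particular form of $\alpha(\Theta)$ are assumptions for illustration and should be checked against Chen et al. (2016) before being relied on.

```python
from itertools import product

# Hypothetical tiny graph with interval bounds on each edge probability.
NODES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("b", "c"), ("d", "c"), ("d", "e")]
LOWER = {("a", "b"): 0.2, ("b", "c"): 0.2, ("d", "c"): 0.4, ("d", "e"): 0.4}
UPPER = {("a", "b"): 0.9, ("b", "c"): 0.9, ("d", "c"): 0.5, ("d", "e"): 0.5}

def reachable(seeds, live_edges):
    """Number of nodes reachable from the seeds along live edges."""
    seen, stack = set(seeds), list(seeds)
    while stack:
        u = stack.pop()
        for x, y in live_edges:
            if x == u and y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen)

def sigma(probs, seeds):
    """Exact independent-cascade spread by enumerating all live-edge graphs."""
    total = 0.0
    for states in product([0, 1], repeat=len(EDGES)):
        weight = 1.0
        for e, s in zip(EDGES, states):
            weight *= probs[e] if s else 1.0 - probs[e]
        live = [e for e, s in zip(EDGES, states) if s]
        total += weight * reachable(seeds, live)
    return total

def greedy(probs, k):
    S = []
    for _ in range(k):
        rest = [v for v in NODES if v not in S]
        S.append(max(rest, key=lambda v: sigma(probs, S + [v])))
    return S

def lu_greedy(lower, upper, k):
    """Return the chosen seed set and the solution-dependent ratio alpha."""
    s_low, s_up = greedy(lower, k), greedy(upper, k)
    chosen = max([s_low, s_up], key=lambda S: sigma(lower, S))
    alpha = sigma(lower, chosen) / sigma(upper, s_up)
    return chosen, alpha
```

With one seed, the upper-endpoint greedy prefers `a` (a long but uncertain chain) while the lower-endpoint greedy prefers `d`, and the sketch keeps `d` because it is the safer choice when every probability sits at its lower bound. Wide intervals drive `alpha` down, which is how the guarantee reflects the amount of uncertainty.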

Scenario-Sampling and MWU

For hyperparametric robust IM (and its dynamic variants), the parameter space is discretized via random sampling (covering), and multiplicative weight updates (MWU) are applied to hedge across "scenarios." The HIRO algorithm iteratively computes (via greedy submodular maximization) the best response to a weighted mixture of sampled hyperparameters, with the union of $T = \widetilde{O}(\log n / \varepsilon^2)$ instantiations yielding a $(1-1/e - o(1))$ approximation with $O(\log n)$ bi-criteria relaxation (Kalimeris et al., 2019). This approach extends to fully dynamic networks in the RIME algorithm, where fast incremental greedy coverage updates maintain a near-optimal robust solution with provable guarantees under edge/node insertions and deletions (Saha et al., 2024).
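The interplay between scenario weights and greedy best responses can be sketched as follows. This is a generic multiplicative-weights loop over a finite list of sampled scenarios, not a reproduction of HIRO or RIME: the toy coverage scenarios, the learning rate `eta`, the exponential update, and the normalization by the number of nodes are assumptions made for illustration.

```python
import math

# Hypothetical sampled scenarios, represented as coverage functions.
NODES = ["a", "b", "c", "d"]
SCENARIOS = [
    {"a": {"a", "b", "c"}, "b": {"b"}, "c": {"c", "d"}, "d": {"d"}},
    {"a": {"a"}, "b": {"b", "c", "d"}, "c": {"c"}, "d": {"d", "a"}},
]

def spread(reach, seeds):
    covered = set()
    for v in seeds:
        covered |= reach[v]
    return len(covered)

def weighted_greedy(weights, k):
    """Greedy best response to the weighted mixture sum_i w_i * f_i(S)."""
    def mixture(S):
        return sum(w * spread(r, S) for w, r in zip(weights, SCENARIOS))
    S = []
    for _ in range(k):
        rest = [v for v in NODES if v not in S]
        S.append(max(rest, key=lambda v: mixture(S + [v])))
    return S

def mwu_robust(k, rounds=20, eta=0.5):
    m = len(SCENARIOS)
    weights = [1.0 / m] * m
    solutions = []
    for _ in range(rounds):
        S = weighted_greedy(weights, k)
        solutions.append(S)
        # Scenarios in which S spreads poorly are penalized least, so they
        # carry more weight in later rounds.
        weights = [w * math.exp(-eta * spread(r, S) / len(NODES))
                   for w, r in zip(weights, SCENARIOS)]
        total = sum(weights)
        weights = [w / total for w in weights]
    union = sorted(set().union(*solutions))
    avg_spread = [sum(spread(r, S) for S in solutions) / rounds for r in SCENARIOS]
    return solutions, union, min(avg_spread)
```

Because spread is monotone, the union of the per-round seed sets does at least as well in every scenario as the average over rounds, which is the intuition behind returning a union at the cost of a larger seed set.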

Distributionally Robust and Submodular Methods

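Correlation-robust influence maximization is solved by reformulating the worst-case expected influence as a polynomial-sized linear program and showing that, for this uncertainty set, the worst-case expected influence remains monotone and submodular in the seed set, even though pointwise minima of submodular functions are not submodular in general. The classical submodular greedy algorithm therefore again yields a $(1-1/e)$ approximation (Chen et al., 2020).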

In continuous-budget bipartite settings, Staib and Jegelka demonstrate that the robust budget allocation saddle-point problem admits an exact polynomial-time solution by exploiting continuous submodularity in the adversary’s parameter minimization, discretization to the integer lattice, and convex surrogates over isotonic cones (Staib et al., 2017).

Network Inference from Cascades

Robust influence maximization with unknown graph and parameters (network inference from samples) can be addressed by learning edge activation probabilities through empirical one-step activations and then applying any constant-factor IM algorithm to the estimated network. Even with weak moment and seed-distribution assumptions, this approach achieves constant-factor robust guarantees for all $p$ within the estimated confidence set (Zhang et al., 2021).

4. Empirical Performance and Practical Considerations

Empirical results indicate that robust IM algorithms often match or nearly match classical heuristics when underlying uncertainty is moderate and the scenario space is not adversarial (He et al., 2016, Chen et al., 2016, Kalimeris et al., 2019, Saha et al., 2024). For example, with only 1.5x the seed set size, worst-case objective values achieved by all robust and heuristic algorithms approach 1.0 on real-world datasets (He et al., 2016). Adaptive, cascade-based sampling reduces parameter uncertainty efficiently, improving sample efficiency by orders of magnitude over uniform sampling (Chen et al., 2016).

In correlation-robust models, the price of ignoring worst-case correlations is potentially catastrophic in pathological instances, but in standard networks IC-based seeds perform nearly as well, and correlation-robust seeds lose only a small fraction under the classical model (Chen et al., 2020).

In dynamic hyperparametric models, algorithms like RIME achieve substantial computational savings and maintain spread guarantees even under rapid network evolution, scaling to networks with $n\approx10^5$ nodes—far beyond what recomputation-based approaches can manage (Saha et al., 2024).

5. Model Structures, Uncertainty Sets, and Robustness Dimensions

Different works exploit structural assumptions to mitigate worst-case hardness:

Finite scenario lists: Robust IM treats different parameterizations as a finite set of monotone submodular functions, suitable for scenario aggregation and submodular cover (He et al., 2016).
Parametric uncertainty sets: Intervals, ellipsoids, and hyperparametric models make the uncertainty geometry more tractable for sampling-based coverage, MWU, or optimization-based solvers (Chen et al., 2016, Kalimeris et al., 2019, Saha et al., 2024).
Distributionally robust sets: Joint distribution uncertainty respecting marginals, as in correlation-robust IM, can be exploited using LP/shortest-path characterizations (Chen et al., 2020).
Unknown, sampled network: Robustness is realized via concentration inequalities and plug-in estimators applied to cascade-based empirical likelihoods (Zhang et al., 2021).

The choice of uncertainty set and model dimension directly impacts both tractability and the achievable approximation factor. Low-dimensional hyperparametric models admit covering-based reductions, while unconstrained uncertainty (general interval sets or adversarial scenarios) generally allows only bicriteria guarantees with significant seed-set blowup.

6. Extensions and Open Research Directions

Key open directions include:

Active hyperparameter sampling: Adaptive selection of scenario points in hyperparametric robust IM to improve sample and computational efficiency (Kalimeris et al., 2019, Saha et al., 2024).
Integration of learning and robustness: Combining data-driven hyperparameter estimation with robust optimization to manage residual uncertainty.
Beyond generalized linear models: Extending to complex, nonlinear, or non-Lipschitz diffusion probability models, handling richer feature dependencies (Kalimeris et al., 2019).
Correlation-aware and dependency-robust extensions: Partial independence, blendings of correlation-robust and classical IC settings, and adaptive robustification to observed dependencies (Chen et al., 2020).
Sample complexity and observation regime optimization: Reducing sample requirements for robust network inference and influence maximization from cascade data, especially in partially observed environments (Zhang et al., 2021).
Adaptivity and scalability in dynamic networks: Maintaining robust, near-optimal seeds efficiently under continual graph modifications, and developing strong dynamic regret bounds (Saha et al., 2024).

7. Relationship to Classical Influence Maximization and Broader Impact

Robust optimization generalizes classical influence maximization by ensuring performance across multiple plausible models, parameter settings, or adversarially selected cases. The core theoretical barrier, a switch from submodular maximization to min-max or ratio objectives, changes the complexity landscape, yet it has prompted new algorithmic frameworks such as MWU, sample-covering, and discrete bicriteria maximization. This approach protects against model misspecification, estimation error, and unknown correlations, all of which are pervasive in practical diffusion systems. It moves influence maximization from results that are valid in expectation or under point estimates to performance guarantees that hold under deep structural and statistical uncertainty (He et al., 2016, Chen et al., 2016, Kalimeris et al., 2019, Chen et al., 2020, Staib et al., 2017, Saha et al., 2024, Zhang et al., 2021).