MCMI: Minimum Cost Maximum Influence

Updated 19 April 2026

MCMI is an optimization paradigm that selects a minimum-cost set of seed nodes to maximize influence spread using models like Independent Cascade and Linear Threshold.
It employs exact methods, greedy heuristics, and evolutionary algorithms to balance cost, influence, and time constraints in network diffusion.
Practical applications include viral marketing, rumor containment, and graph-based reasoning, with research emphasizing scalability and multilayer integration.

The Minimum Cost Maximum Influence (MCMI) problem is a foundational optimization paradigm in social network analysis and information diffusion. It seeks to select a set of initial nodes (seeds) with minimal total cost such that the resulting influence spread—the expected number of activated nodes or groups, under a stochastic diffusion model—is maximized, often under additional budget, connectivity, or temporal constraints. MCMI formulations and algorithms are core to applications in viral marketing, information campaigns, rumor containment, and modern graph-based reasoning in machine learning. A broad array of mathematical models, complexity results, algorithmic frameworks (exact, approximate, and heuristic), and empirical methodologies have been developed to address MCMI in single-layer, multilayer, and multiplex networks.

1. Formal Models and Mathematical Formulations

MCMI variants share the essential structure of maximizing diffusion subject to a resource bound:

$\max_{S\subseteq V} \quad \sigma(S) \quad \text{s.t.} \quad C(S) = \sum_{v\in S} c_v \le B$

where $G=(V,E)$ is a directed or undirected social network, $\sigma(S)$ is the influence spread (the expected number of activated nodes under a model such as Independent Cascade (IC) or Linear Threshold (LT)), $c_v\geq 0$ is the activation cost of node $v$, and $B$ is the budget (Zhang et al., 2024, Zhang et al., 2016).
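Because $\sigma(S)$ is an expectation, it is usually estimated by simulation. The following is a minimal sketch, assuming the Independent Cascade model with per-edge activation probabilities stored in a nested dictionary; the names `simulate_ic`, `estimate_spread`, and `is_feasible`, and the toy graph, are illustrative and do not come from the cited papers.

```python
import random

def simulate_ic(graph, seeds, rng):
    """One Independent Cascade run; returns the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # Each newly active node gets a single chance to activate v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active)

def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of sigma(S): average cascade size over `runs` runs."""
    rng = random.Random(seed)
    return sum(simulate_ic(graph, seeds, rng) for _ in range(runs)) / runs

def is_feasible(seeds, cost, budget):
    """Budget constraint C(S) <= B."""
    return sum(cost[v] for v in seeds) <= budget

# Toy instance (hypothetical): graph[u][v] is the probability that u activates v.
graph = {"a": {"b": 1.0, "c": 0.0}, "b": {"d": 1.0}, "c": {}, "d": {}}
cost = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 1.0}
spread_a = estimate_spread(graph, {"a"}, runs=50)
```

With probabilities of 0 and 1 the toy cascade is deterministic, which makes the sketch easy to check by hand; on real graphs the estimate carries sampling error that shrinks as `runs` grows.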

Generalizations include:

Tri-objective MCMI: Simultaneously optimize influence, cost, and propagation time.

$\max_{S} (\sigma(S), -C(S), -T(S))$

with $T(S)$ the (random) completion time to reach full diffusion (Feng et al., 9 Sep 2025).

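Multilayer MCMI: Influence spreads in $M$ layers $G^m=(V,E^m)$, $m=1,\dots,M$, over the same node set, and the objective is

$\max_{S\subseteq V} \quad \sigma_{G^1,\dots,G^M}(S) \quad \text{s.t.} \quad C(S)\le B, \quad |S|\le K$

where $\sigma_{G^1,\dots,G^M}(S)$ is the expected spread across all layers, $B$ is the global budget, and $K$ is the seed-size (capacity) limit (Zhang et al., 2024).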

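Minimum-seed (Least Cost Influence): Minimize the seed cost $C(S)$ (equivalently $|S|$ under unit costs) subject to a coverage constraint, either an absolute target $\sigma(S)\ge\eta$ or a fractional target $\sigma(S)\ge\beta|V|$ (Zhang et al., 2016, Chen et al., 2022).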
Group Influence with Minimum Cost: Seeds activate groups, each requiring a threshold total influence from activated members; coverage over groups, not just individuals (Pham et al., 2021).
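Cost-aware subgraph retrieval: Select a connected subgraph containing the terminals $R\subseteq V$ that maximizes the aggregate influence score relative to the incurred edge cost,

$\max_{H} \quad \frac{\sum_{v\in V_H} s(v)}{\sum_{e\in E_H} c(e)}$

for subgraph $H=(V_H,E_H)$ of $G$, with $R\subseteq V_H$ and $H$ connected, where $s(v)$ is the influence score of node $v$ and $c(e)$ is the cost of edge $e$ (Wang et al., 2 Nov 2025).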

2. Complexity and Polyhedral Results

MCMI and its central variants are NP-hard. The hardness is established via reductions from classical Set Cover (single-layer IM), Steiner Tree (cost-aware connected subgraphs), and Knapsack (budgeted optimization) (Wang et al., 2 Nov 2025, Feng et al., 9 Sep 2025, Chen et al., 2022).

Key complexity aspects:

Influence maximization ($\max_{S} \sigma(S)$) under IC/LT is NP-hard, and it is #P-hard to exactly compute $\sigma(S)$ (Zhang et al., 2024, Chen et al., 2022).
Multi-objective (influence, cost, time): The problem remains NP-hard, and non-submodular objectives (e.g., time-to-activation or group coverage) additionally rule out classical greedy guarantees (Feng et al., 9 Sep 2025).
Polyhedral structure: In the cost-minimization view, tight mixed-integer program (MIP) formulations are enhanced with continuous-cover, packing, and minimum-influencing-subset inequalities, all of which give facet-defining constraints for the convex hull of feasible solutions. Efficient separation algorithms and dynamic programming are available for cycles and trees, enabling provably optimal solutions for certain topologies (Chen et al., 2022).

3. Algorithmic Approaches

Exact and Polyhedral Methods

For moderate-scale instances, exact MIP and delayed cut generation (branch-and-cut) approaches are highly effective, especially when strengthened with polyhedral inequalities that reduce integrality gaps (Chen et al., 2022). These include:

Continuous–cover and packing cuts: Derived from knapsack cover substructures.
Minimum-influencing-subset cuts: Capture tight infeasibility for the coverage variable $y_j$ at node $j$.
Cycle-elimination and (U,C) inequalities: Ensure acyclicity in influence flows.
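For background on the knapsack cover substructure these cuts build on, the classical cover inequality can be stated as below. This is the textbook inequality for a 0-1 knapsack set, shown only for orientation; it is not the specific continuous-cover cut of Chen et al. (2022), and the symbols $a_i$, $b$, $x_i$ are generic.

```latex
% Knapsack set: X = \{ x \in \{0,1\}^n : \sum_{i=1}^n a_i x_i \le b \}, with a_i, b > 0.
% A set C \subseteq \{1,\dots,n\} is a cover if its weights exceed the capacity:
\sum_{i \in C} a_i > b
% Not every item of a cover can be chosen at once, which gives the valid inequality
\sum_{i \in C} x_i \le |C| - 1
% Example: 3x_1 + 4x_2 + 5x_3 \le 8. C = \{2, 3\} is a cover since 4 + 5 > 8,
% so x_2 + x_3 \le 1, which cuts off the fractional LP point (0, 1, 0.8).
```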

Dynamic programming solves special cases (cycles, trees) in polynomial time; for general graphs, the cut-based MIP approach solves almost all small/medium instances to optimality in seconds or minutes (Chen et al., 2022).

Greedy and Approximate Methods

Classical greedy algorithms are applicable when the influence spread $\sigma(S)$ is monotone and submodular (e.g., under the IC and LT models). Under these conditions, a greedy scheme yields a $(1-1/e)$ approximation (Zhang et al., 2024, Feng et al., 9 Sep 2025, Cao et al., 12 Mar 2026).

Submodular greedy (e.g. Algorithm 1 in (Cao et al., 12 Mar 2026)): Iteratively select the node $v$ with the largest marginal gain $\sigma(S\cup\{v\})-\sigma(S)$ while respecting the budget.
Lazy greedy: Maintains a heap of marginal gains, updating only when necessary, resulting in significant speedups (Zhang et al., 2016).
Surrogate greedy for steady-state causal objectives: Uses simulation and shape-constrained learning to optimize welfare under general interference with provable approximation guarantees, modulo estimation and structural bias (Cao et al., 12 Mar 2026).
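The lazy evaluation idea can be sketched as follows. This is a minimal illustration, assuming a monotone submodular set function passed in as `spread` and strictly positive node costs; it ranks candidates by marginal gain per unit cost, which is one common budgeted variant and not necessarily the rule used in the cited papers. The ratio rule on its own is not claimed here to carry the cardinality-constrained guarantee. The coverage objective and all names are hypothetical.

```python
import heapq

def lazy_greedy(candidates, spread, cost, budget):
    """Budgeted lazy greedy: pick nodes by marginal gain per unit cost."""
    selected, spent = [], 0.0
    base = spread(selected)
    # Heap entries: (-gain/cost, node, number of seeds when the gain was computed).
    heap = [(-(spread([v]) - base) / cost[v], v, 0) for v in candidates]
    heapq.heapify(heap)
    while heap:
        _, v, stamp = heapq.heappop(heap)
        if spent + cost[v] > budget:
            continue  # unaffordable now and later, because spent only grows
        if stamp == len(selected):
            # The gain is up to date; by submodularity stale entries cannot beat it.
            selected.append(v)
            spent += cost[v]
            base = spread(selected)
        else:
            # Stale gain: recompute against the current seed set and reinsert.
            gain = spread(selected + [v]) - base
            heapq.heappush(heap, (-gain / cost[v], v, len(selected)))
    return selected, spent

# Toy coverage objective (hypothetical): each node "reaches" a fixed set of users.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2, 3, 4}}
cost = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 3.0}

def coverage(seeds):
    return len(set().union(*(reach[v] for v in seeds)))

seeds, spent = lazy_greedy(reach, coverage, cost, budget=2.0)
```

In this toy run only node "b" has its gain recomputed after "a" is chosen; the other entries are either taken fresh or discarded as unaffordable, which is where the speedup comes from on large candidate sets.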

Metaheuristics and Evolutionary Algorithms

Heuristic and metaheuristic approaches enable scalable optimization when submodularity fails or multiple objectives/constraints preclude greedy guarantees.

Multilayer multi-population genetic algorithm (MMGA): Runs K parallel genetic algorithms (one for each seed-size), uses crossover/mutation/repair to enforce cost and capacity bounds. Empirically achieves 8–15% higher spread than baselines on multilayer MIC networks (Zhang et al., 2024).
Embedding-aligned variable-length evolutionary algorithm (EVEA): Pareto-based multi-objective evolutionary approach with variable-length encoding and embedding-informed crossover for joint optimization of $\sigma(S)$, $C(S)$, and $T(S)$. Demonstrates 19.3% higher Pareto hypervolume and 25–40% improved convergence over NSGA-II (Feng et al., 9 Sep 2025).
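Pareto-based methods rest on a dominance test between candidate seed sets. The sketch below is a generic non-dominated filter over (influence, cost, time) triples, with influence maximized and cost and time minimized; it illustrates the concept only and is not the selection scheme of EVEA or NSGA-II. The function names and sample values are made up for the example.

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and strictly better somewhere.

    p and q are (influence, cost, time): higher influence is better,
    lower cost and lower time are better.
    """
    no_worse = p[0] >= q[0] and p[1] <= q[1] and p[2] <= q[2]
    strictly_better = p[0] > q[0] or p[1] < q[1] or p[2] < q[2]
    return no_worse and strictly_better

def pareto_front(points):
    """Keep the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical evaluations of four candidate seed sets.
candidates = [(10.0, 3.0, 5.0), (8.0, 3.0, 6.0), (12.0, 6.0, 4.0), (10.0, 2.0, 7.0)]
front = pareto_front(candidates)
```

Here the second candidate is dropped because the first reaches more nodes at the same cost in less time, while the remaining three represent genuine trade-offs.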

Subgraph Formulations and Retrieval-Augmented Methods

Graph reasoning tasks—such as those in retrieval-augmented generation (RAG) for LLMs—frame MCMI as a connected subgraph optimization, trading off node influence and edge cost with strict requirements on path comprehensiveness and explainability (Wang et al., 2 Nov 2025).

AGRAG approach: Uses a two-phase greedy procedure: (1) a 2-approximate Steiner tree to ensure connectivity among terminals; (2) greedy neighbor expansion by marginal benefit-to-cost ratio until no further improvement is possible. Empirically outperforms tree-based and local-walk retrievals in both reasoning quality and efficiency.
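A connect-then-expand procedure of this kind can be sketched as below. This is not AGRAG's implementation: phase 1 uses the classical metric-closure minimum spanning tree heuristic (a known 2-approximation for Steiner tree cost), and phase 2 assumes the objective is total node score divided by total edge cost. The sketch further assumes an undirected connected graph with positive edge costs and at least two distinct terminals; every name and value is hypothetical.

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path distances and predecessors from src."""
    dist, prev = {src: 0.0}, {src: None}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, c in adj[u].items():
            if d + c < dist.get(v, float("inf")):
                dist[v], prev[v] = d + c, u
                heapq.heappush(heap, (d + c, v))
    return dist, prev

def connect_terminals(adj, terminals):
    """Phase 1: Prim's MST on the terminals' metric closure, expanded to real edges."""
    terminals = list(terminals)
    sp = {t: dijkstra(adj, t) for t in terminals}
    in_tree, edges = {terminals[0]}, set()
    while len(in_tree) < len(terminals):
        # Cheapest shortest path from a connected terminal to an unconnected one.
        _, a, b = min((sp[a][0][b], a, b) for a in in_tree
                      for b in terminals if b not in in_tree)
        in_tree.add(b)
        node = b
        while node != a:  # walk the shortest a-b path backwards from b
            parent = sp[a][1][node]
            edges.add(frozenset((node, parent)))
            node = parent
    return edges

def expand(adj, score, terminals, edges):
    """Phase 2: add the best-ratio neighbor while the overall ratio improves."""
    edges = set(edges)
    nodes = set(terminals) | {u for e in edges for u in e}
    total = sum(score[u] for u in nodes)
    cost = sum(adj[u][v] for u, v in map(tuple, edges))
    while True:
        frontier = [(score[v] / c, u, v, c) for u in nodes
                    for v, c in adj[u].items() if v not in nodes]
        if not frontier:
            break
        _, u, v, c = max(frontier)
        # Adding v helps only if its own ratio beats the current overall ratio.
        if (total + score[v]) / (cost + c) <= total / cost:
            break
        nodes.add(v)
        edges.add(frozenset((u, v)))
        total, cost = total + score[v], cost + c
    return nodes, edges

# Toy graph (hypothetical): adj[u][v] is the cost of the undirected edge u-v.
adj = {
    "a": {"b": 1.0, "c": 3.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 3.0, "b": 1.0, "d": 1.0, "e": 5.0},
    "d": {"c": 1.0},
    "e": {"c": 5.0},
}
score = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 5.0, "e": 1.0}
tree = connect_terminals(adj, ["a", "c"])
nodes, edges = expand(adj, score, ["a", "c"], tree)
```

In the toy graph the terminals are joined through "b" rather than the direct but costlier edge, the high-score node "d" is then absorbed, and the low-score, expensive node "e" is left out because it would lower the overall ratio.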

4. Extension to Multilayer and Multiplex Networks

Influence can propagate through multiple social layers simultaneously—necessitating adaptations of MCMI:

Multiplex LCI/MCMI: Propagation occurs in $M$ network layers, possibly with overlapping users. Solutions leverage "lossless" (exact node-splitting and synchronization in an expanded graph) and "lossy" (weight/threshold aggregation) coupling mechanisms (Zhang et al., 2016). Lossless coupling ensures provable correctness at 2–4× computational cost, while lossy coupling accelerates practical computation with some loss in spread-optimality.
Overlapping users: Play outsized roles as "relays"; empirical results show that overlapping users, even when they are only 5–7% of all users, make up as much as 40% of the optimal seed sets and generate up to 70% of total spread for low target coverage (Zhang et al., 2016).
BCIM (Budget & Capacity in Multilayer): Extends MCMI by simultaneously constraining total seed budget and cardinality. Empirical evidence shows a multilayer genetic approach outperforms single-layer and isolated solutions by up to 15% (Zhang et al., 2024).
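As a rough illustration of the lossy idea, the sketch below merges per-layer Linear Threshold graphs into a single graph by summing edge weights and node thresholds across layers. This particular aggregation rule, the function name `couple_layers_lossy`, and the data layout are assumptions made for the example; the cited work may aggregate differently.

```python
from collections import defaultdict

def couple_layers_lossy(layers, thresholds):
    """Merge layers into one graph by summing weights and thresholds.

    layers:     list of {u: {v: weight}} dictionaries, one per layer
    thresholds: list of {v: theta} dictionaries, one per layer
    A user present in several layers appears under the same key in each.
    """
    weights = defaultdict(lambda: defaultdict(float))
    theta = defaultdict(float)
    for layer in layers:
        for u, neighbors in layer.items():
            for v, w in neighbors.items():
                weights[u][v] += w
    for layer_theta in thresholds:
        for v, t in layer_theta.items():
            theta[v] += t
    return {u: dict(n) for u, n in weights.items()}, dict(theta)

# Two toy layers (hypothetical) sharing users "x" and "y".
layer1 = {"x": {"y": 0.5}, "y": {"z": 0.25}}
layer2 = {"x": {"y": 0.25, "w": 0.5}}
merged, merged_theta = couple_layers_lossy(
    [layer1, layer2],
    [{"y": 0.5, "z": 0.25}, {"y": 0.25, "w": 0.5}],
)
```

The merged graph is smaller than a node-split expansion, which is the source of the speedup, but it no longer tracks in which layer a user was activated, which is where spread-optimality can be lost.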

5. Empirical Evaluation and Performance Benchmarks

Extensive experiments support the efficacy of diverse MCMI strategies:

Algorithmic benchmarks: Polyhedral MIP with delayed cut generation closes nearly all small and medium instances to optimality rapidly; evolutionary and greedy heuristics scale to much larger graphs, obtaining solutions within 1–5% of optimal in sub-second times for large random graphs (Chen et al., 2022, Zhang et al., 2024).
Pareto efficiency: Multi-objective methods such as EVEA yield well-distributed trade-off fronts between cost, influence, and propagation time, achieving up to 19.3% greater hypervolume than previous approaches (Feng et al., 9 Sep 2025).
Steady-state welfare maximization: CIM outperforms greedy influence maximization by 1–5% in causal welfare, achieves 2–3 orders of magnitude better runtimes due to surrogate objective compression, and is robust to outcome and propagation noise (Cao et al., 12 Mar 2026).
Reasoning graphs for LLMs: MCMI subgraphs learned in AGRAG provide more comprehensive, cycle-inclusive reasoning traces, increasing chain-of-thought accuracy and faithfulness in QA tasks and reducing computational cost via explicit path constraints and token reduction (Wang et al., 2 Nov 2025).

Method	Approximation Guarantee	Empirical Speedup/Benefit
Greedy (submodular)	$1-1/e$	Large graphs in sub-second time
Polyhedral MIP	Exact (small graphs)	Solves small/medium instances in seconds to minutes
MMGA	Heuristic (no guarantee)	8–15% more spread vs. baselines
EVEA	Pareto-efficient fronts	19.3% higher hypervolume, 25–40% faster convergence
AGRAG MCMI	2-approximate	1.7× faster, improved reasoning paths

6. Practical Insights and Open Directions

Practical deployment of MCMI-inspired influence campaigns and reasoning frameworks requires:

Budget and deadline setting: Use Pareto envelopes to select budgets and temporal windows with favorable trade-off slopes (Feng et al., 9 Sep 2025).
Heuristic initialization and acceleration: For large instances, degree/influence-cost ratio and reverse influence sampling are effective (Zhang et al., 2024).
Design for robust and fair inference: Steady-state causal estimation, shape-constrained learning, and exposure mapping increase reliability in interfered or noisy settings (Cao et al., 12 Mar 2026).
Coupling for multi-platform optimization: Exploit user overlaps and cross-layer linkages; coupling consistently reduces the required seed size and spreads influence more efficiently (Zhang et al., 2016).
Subgraph-based reasoning: Rich, cyclic, and multi-branch MCMI subgraphs improve LLM interpretability and retrieval-augmented generation quality (Wang et al., 2 Nov 2025).
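The degree-to-cost initialization mentioned above can be sketched in a few lines. This is a minimal illustration, assuming out-degree as the influence proxy and strictly positive costs; the function name and the toy data are hypothetical, and in practice such a seed set would serve as a starting point for a stronger method rather than as a final answer.

```python
def degree_cost_seeds(graph, cost, budget):
    """Rank nodes by out-degree per unit cost and add them while the budget allows."""
    ranked = sorted(graph, key=lambda v: len(graph[v]) / cost[v], reverse=True)
    seeds, spent = [], 0.0
    for v in ranked:
        if spent + cost[v] <= budget:
            seeds.append(v)
            spent += cost[v]
    return seeds, spent

# Toy instance (hypothetical): adjacency lists and per-node activation costs.
graph = {"a": ["b", "c", "d"], "b": ["c"], "c": [], "d": ["a", "b"]}
cost = {"a": 4.0, "b": 1.0, "c": 1.0, "d": 1.0}
initial_seeds, initial_cost = degree_cost_seeds(graph, cost, budget=2.0)
```

Node "a" has the highest degree but is skipped because its cost exceeds the budget, so the cheaper nodes "d" and "b" are chosen instead.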

Open questions involve unifying submodular and non-submodular objectives, scalable cut-generation for arbitrary graphs, integration with temporal and spatial constraints, and adaptive or online MCMI under partial feedback or evolving networks. The polyhedral and algorithmic innovations from recent work provide a rigorous foundation for these pursuits.