sSID: Sign-Augmented Structural Intervention Distance
- sSID is a metric that extends the Structural Intervention Distance by penalizing both structural errors and sign discrepancies in causal effect estimation.
- It introduces a weighted variant, wsSID, which quantifies the magnitude of polarity deviations to handle partial cancellations in biological networks.
- The algorithm employs matrix algebra and path enumeration to efficiently evaluate both reachability and directional accuracy, supporting improved model selection for interventions.
The sign-augmented Structural Intervention Distance (sSID) extends the Structural Intervention Distance (SID) framework for causal graph comparison by incorporating the directionality (sign) of causal effects. Developed in the context of biological networks where reference annotations are qualitative or sign-valued, sSID penalizes both incorrectly inferred interventional targets and disagreement in the net sign of “total-effect” causal relationships. A further weighted version, wsSID, quantifies the magnitude of polarity deviation and addresses partial path cancellations. sSID is particularly relevant for evaluating causal graphs inferred from biological data, where direction and sign of gene regulation or intervention effects are often critical for downstream applications such as drug target identification or clinical outcome prediction (Sato et al., 16 Nov 2025).
1. Formal Definition and Mathematical Framework
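Let $G$ be a directed acyclic graph (DAG) on $p$ nodes with weighted adjacency matrix $W$ (entries $w_{ij}$, the weight of edge $i \to j$), encoding direct causal effects. The total-effect matrix is defined as

$$T = \sum_{k=1}^{p-1} W^{k} = (I - W)^{-1} - I,$$

so that $T_{ij}$ sums, over all directed paths from $i$ to $j$, the product of the edge weights along each path. For sign annotation,

$$S_{ij} = \begin{cases} \operatorname{sign}(T_{ij}) & \text{if the sign of the effect of } i \text{ on } j \text{ is annotated}, \\ \mathrm{NA} & \text{otherwise}, \end{cases}$$

with NA denoting “undefined” due to missing sign annotation.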
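The total-effect computation can be illustrated with a minimal sketch. It assumes the adjacency matrix `w` stores the weight of edge `i -> j` at `w[i][j]`, and it uses `None` for NA whenever the total effect is numerically zero; both conventions, and the function names, are assumptions made for illustration rather than details taken from the source.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def total_effect(w):
    """Return T = W + W^2 + ... + W^(p-1) for a DAG adjacency matrix W.

    A DAG on p nodes has no directed path longer than p - 1 edges,
    so higher powers of W vanish and the sum is finite.
    """
    p = len(w)
    total = [[0.0] * p for _ in range(p)]
    power = [row[:] for row in w]  # current power W^k, starting at k = 1
    for _ in range(p - 1):
        for i in range(p):
            for j in range(p):
                total[i][j] += power[i][j]
        power = matmul(power, w)
    return total


def sign_matrix(t, tol=1e-12):
    """Map total effects to +1 / -1; None stands in for NA (assumption: |T_ij| <= tol)."""
    return [[None if abs(x) <= tol else (1 if x > 0 else -1) for x in row]
            for row in t]


# Hypothetical chain A -> B -> C with a negative first edge.
W = [[0.0, -0.5, 0.0],
     [0.0, 0.0, 2.0],
     [0.0, 0.0, 0.0]]
T = total_effect(W)
S = sign_matrix(T)
print(T[0][2])  # -1.0: the single path A -> B -> C has weight (-0.5) * 2.0
print(S[0][2])  # -1
```

The explicit power sum is chosen here because it makes the path interpretation visible; for larger graphs a linear solve against $I - W$ would be the more usual route.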
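For graphs $G$ (reference) and $H$ (estimate), the pairwise SID indicator $\mathrm{SID}_{ij}(G, H) = 1$ if the adjustment set implied by $H$ for the pair $(i, j)$ is incorrect relative to $G$ (i.e., the interventional distributions $p(x_j \mid \mathrm{do}(x_i))$ differ), and $0$ otherwise (Peters et al., 2013). Writing $S^{G}_{ij}$ and $S^{H}_{ij}$ for the sign of the total effect of $i$ on $j$ in each graph, the sign-error indicator is

$$E_{ij}(G, H) = 1 \quad \text{if } \mathrm{SID}_{ij}(G, H) = 0,\ S^{G}_{ij} \neq \mathrm{NA},\ S^{H}_{ij} \neq \mathrm{NA},\ \text{and } S^{G}_{ij} \neq S^{H}_{ij},$$

and $0$ otherwise.

The sign-augmented SID is then defined as

$$\mathrm{sSID}(G, H) = \sum_{i \neq j} \big[ \mathrm{SID}_{ij}(G, H) + \lambda\, E_{ij}(G, H) \big],$$

with $\lambda$ the weight of a sign error relative to a structural error; a default value is given by Sato et al. (16 Nov 2025).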
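The aggregation step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the pairwise SID indicator is taken as a precomputed input, the score is assumed to be the structural error count plus a weight `lam` times the sign error count, and `lam` has no default because the source's default value is not reproduced here. The function name `ssid` and the example matrices are hypothetical.

```python
def ssid(sid_indicator, sign_ref, sign_est, lam):
    """Aggregate structural and sign errors over ordered pairs (i, j), i != j.

    sid_indicator[i][j] is 1 where the estimate's adjustment set is wrong.
    sign_ref / sign_est hold +1, -1, or None (NA).
    lam is the sign-error weight (assumed form: structural + lam * sign errors).
    """
    p = len(sid_indicator)
    structural = 0
    sign_errors = 0
    for i in range(p):
        for j in range(p):
            if i == j:
                continue
            if sid_indicator[i][j]:
                structural += 1  # a structural error is never also counted as a sign error
                continue
            a, b = sign_ref[i][j], sign_est[i][j]
            if a is not None and b is not None and a != b:
                sign_errors += 1
    return structural + lam * sign_errors, structural, sign_errors


# Hypothetical 3-node inputs: pair (0, 2) is a structural error,
# pair (1, 2) is structurally fine but has the wrong sign.
sid = [[0, 0, 1],
       [0, 0, 0],
       [0, 0, 0]]
sign_ref = [[None, 1, 1],
            [None, None, -1],
            [None, None, None]]
sign_est = [[None, 1, -1],
            [None, None, 1],
            [None, None, None]]
score, n_struct, n_sign = ssid(sid, sign_ref, sign_est, lam=0.5)
print(score, n_struct, n_sign)  # 1.5 1 1
```

Note that the sign mismatch on pair (0, 2) adds nothing beyond the structural penalty, which is the "no duplicated penalties" behavior.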
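The weighted variant (wsSID) is based on the average polarity $\bar{s}_{ij}$, the mean of the path polarities (the product of edge signs along a path) taken over all fully annotated directed paths from $i$ to $j$, and the polarity difference $\Delta_{ij}$ between the average polarities of $G$ and $H$ for that pair.
Thus, wsSID aggregates the structural indicator and the graded polarity difference $\Delta_{ij}$ over all ordered pairs, so that a pair is penalized in proportion to its polarity deviation rather than only when its net sign flips.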
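A small sketch of the average-polarity idea follows. It assumes that a path's polarity is the product of its edge signs, that a path containing an unannotated edge (`None`) is skipped, and that the polarity difference is the absolute gap between the two averages; the exact normalization used by the authors is not stated here, so these choices and the function names are illustrative only.

```python
def path_polarities(edge_sign, src, dst):
    """Enumerate directed paths src -> dst in a DAG and return each path's polarity.

    edge_sign maps node -> {child: +1 | -1 | None}. Paths through a None
    (unannotated) edge are skipped as not fully annotated.
    """
    out = []

    def walk(node, polarity):
        if node == dst:
            out.append(polarity)
            return
        for child, s in edge_sign.get(node, {}).items():
            if s is not None:
                walk(child, polarity * s)

    walk(src, 1)
    return out


def average_polarity(edge_sign, src, dst):
    """Mean path polarity, or None when no fully annotated path exists."""
    pols = path_polarities(edge_sign, src, dst)
    return sum(pols) / len(pols) if pols else None


# Hypothetical diamond A -> B -> D and A -> C -> D.
ref = {"A": {"B": 1, "C": 1}, "B": {"D": 1}, "C": {"D": 1}}
est = {"A": {"B": 1, "C": -1}, "B": {"D": 1}, "C": {"D": 1}}

avg_ref = average_polarity(ref, "A", "D")
avg_est = average_polarity(est, "A", "D")
delta = abs(avg_ref - avg_est)  # assumed form of the polarity difference
print(avg_ref, avg_est, delta)  # 1.0 0.0 1.0
```

Explicit enumeration is exponential in the worst case, so it is shown only because it mirrors the definition directly.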
2. Algorithmic Computation
Efficient computation of sSID and wsSID employs both matrix algebra and path enumeration:
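- Construct weighted adjacency matrices $W^{G}$ and $W^{H}$ for the reference and estimated graphs, with a nonzero entry $w_{ij}$ for each edge $i \to j$.
- Compute total-effect matrices: $T = \sum_{k=1}^{p-1} W^{k}$ for a graph on $p$ nodes, or via $T = (I - W)^{-1} - I$ for acyclic graphs.
- Derive sign matrices $S^{G}$ and $S^{H}$ from the total effects.
- Enumerate all fully annotated directed paths to obtain average polarities for all pairs $(i, j)$.
- For each pair $(i, j)$: evaluate the SID indicator, the sign-error indicator, and the polarity difference.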
- Aggregate as per the sSID or wsSID definition.
Complexity depends on sparsity and graph size. Computing the total-effect matrix by matrix inversion is $O(p^3)$ for dense graphs on $p$ nodes, with dynamic programming enabling practical performance in sparse graphs (Sato et al., 16 Nov 2025).
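One way the dynamic-programming idea could look is sketched below: instead of enumerating paths, it propagates counts of positive and negative paths from a source node in topological order, which visits each edge once. This is a swapped-in illustration and not the published implementation; the function names are hypothetical, and the average polarity is again assumed to be the mean of path-sign products.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def signed_path_counts(edge_sign, src):
    """Count positive and negative directed paths from src to every node.

    edge_sign maps node -> {child: +1 | -1 | None}; None edges are ignored,
    so only fully annotated paths are counted.
    """
    preds = {}
    for u, children in edge_sign.items():
        preds.setdefault(u, set())
        for v in children:
            preds.setdefault(v, set()).add(u)
    order = list(TopologicalSorter(preds).static_order())  # parents before children

    pos = {v: 0 for v in order}
    neg = {v: 0 for v in order}
    pos[src] = 1  # the empty path at src has polarity +1
    for u in order:
        for v, s in edge_sign.get(u, {}).items():
            if s == 1:      # a positive edge keeps each path's polarity
                pos[v] += pos[u]
                neg[v] += neg[u]
            elif s == -1:   # a negative edge flips each path's polarity
                pos[v] += neg[u]
                neg[v] += pos[u]
    return pos, neg


def average_polarity_dp(edge_sign, src, dst):
    """Mean path polarity from src to dst, or None if no annotated path exists."""
    pos, neg = signed_path_counts(edge_sign, src)
    total = pos.get(dst, 0) + neg.get(dst, 0)
    return (pos[dst] - neg[dst]) / total if total else None


# Hypothetical diamond with one negative edge on the A -> C -> D branch.
est = {"A": {"B": 1, "C": -1}, "B": {"D": 1}, "C": {"D": 1}}
pos, neg = signed_path_counts(est, "A")
print(pos["D"], neg["D"])                 # 1 1
print(average_polarity_dp(est, "A", "D"))  # 0.0
```

Running the propagation once per source node gives all pairwise averages in time proportional to the number of nodes times the number of edges, which is where sparsity helps.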
3. Theoretical Properties
Key mathematical and operational properties include:
- Non-negativity: $\mathrm{sSID}(G, H) \ge 0$ always, with equality iff both structure and effect sign are matched for all pairs.
- Monotonicity: Adding structural or sign errors strictly increases sSID.
- Asymmetry: Generally $\mathrm{sSID}(G, H) \neq \mathrm{sSID}(H, G)$, due to the directed comparison inherent in SID.
- Sensitivity: Sign errors propagate along reachable descendant pairs, allowing sSID to penalize incorrect sign orientation even if structural reachability is preserved.
- Robustness: Sign errors are only penalized when reachability is correct (no mistaken intervention in adjustment sets), avoiding duplicated penalties.
- Limitations: The metric presumes linear or monotonic relationships where sign annotations are meaningful; not directly applicable to nonparametric or cyclic structures (Sato et al., 16 Nov 2025).
4. Illustrative Examples and Interpretation
Concrete scenarios clarify sSID’s discriminative power:
| Example | SID Outcome | sSID Outcome | wsSID Outcome |
|---|---|---|---|
| Simple chain (A→B→C), sign flip on A→B | 0 | 0.5 (sign error A→C) | 1.0 |
| Parallel paths, partial sign cancellation | 0 | 0 | 1.0 (partial polarity loss) |
In structures with parallel or multi-step paths, wsSID quantifies graded disagreement in average polarity even when the discrete sign majority (and thus sSID) is unchanged. Thus, wsSID resolves partial cancellations, while sSID detects binary sign inversions.
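A worked illustration of the parallel-path case may help. The numbers are hypothetical and chosen only to keep the arithmetic simple: two paths A→B→D and A→C→D with path weights $0.72$ and $0.10$ in the reference graph $G$, and the second path's sign flipped in the estimate $H$. The average polarity $\bar{s}$ is assumed here to be the mean of the path-sign products.

$$
\begin{aligned}
T^{G}_{AD} &= 0.72 + 0.10 = 0.82, & \operatorname{sign}(T^{G}_{AD}) &= +1, & \bar{s}^{G}_{AD} &= \tfrac{1}{2}(1 + 1) = 1, \\
T^{H}_{AD} &= 0.72 - 0.10 = 0.62, & \operatorname{sign}(T^{H}_{AD}) &= +1, & \bar{s}^{H}_{AD} &= \tfrac{1}{2}(1 - 1) = 0.
\end{aligned}
$$

The net sign of the total effect agrees in both graphs, so no binary sign error is recorded for the pair, while the average polarity drops from $1$ to $0$. That gap is the kind of partial cancellation a graded measure can register and a binary one cannot.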
5. Comparison to Other Metrics: SID and SHD
| Metric | Penalizes | Sensitivity | Distinguishes Functional Errors? |
|---|---|---|---|
| SID | Wrong adjustment sets (structure only) | Insensitive to effect sign | No |
| sSID | SID + sign flips | Sensitive to direction of effect | Yes |
| wsSID | sSID + average polarity deviation | Graded, path-wise effect magnitude | Yes (quantitative) |
SID counts only mismatches in interventional reachability. sSID adds penalties for sign disagreement in total effects, capturing “functionally wrong” inferences undetectable by structural metrics alone. wsSID further quantifies magnitude of disagreement for partially cancelling errors (Sato et al., 16 Nov 2025).
6. Empirical Performance and Applications
Empirical analysis in simulations and cancer transcriptomic datasets demonstrates that sSID and wsSID select different estimator algorithms than SID does. For example, in gene expression networks inferred for the LUAD and PRAD TCGA datasets, networks optimizing sSID achieved a higher area under the precision-recall curve when classifying tumor stage from predicted gene expression features (AUPRC ≈ 0.764 for sSID-optimal vs. ≈ 0.749 for SID-optimal) (Sato et al., 16 Nov 2025).
Simulations show sSID and wsSID grow linearly with the frequency of sign-flip errors, while SID and SHD do not detect such errors. wsSID additionally discerns partial polarity loss due to parallel paths, demonstrating value in biological contexts where signed causal direction is fundamental.
7. Practical Considerations and Implementation
Efficient sSID/wsSID usage requires:
- Data requirements: Reference networks annotated with {+1, −1, NA}; estimated networks with quantitative weights.
- Computational cost: Lowest for sparse DAGs; dynamic programming is recommended for large graphs or complex path structures.
- Choice of $\lambda$: A default value is provided, with higher values suited to sign-sensitive applications.
- Software: Public R/C++ implementations support sSID/wsSID computation, with integration to common structure learning toolkits.
Incorporating sign-awareness yields more biologically and functionally meaningful model evaluation: it distinguishes networks that are “structurally right” but “functionally wrong”, thereby guiding model selection for downstream intervention analysis (Sato et al., 16 Nov 2025).