Graphical ICC (GICC) for Network Reproducibility

Updated 24 May 2026

GICC is a scalar statistic that extends traditional ICC to measure reproducibility in binary graph data using a multivariate probit mixed-effects model.
It computes the ratio of between-subject variance to total variance, enabling robust quantification of consistent network patterns across repeated measurements.
Simulations and real-data applications, including fMRI test–retest studies, validate GICC’s precision and guide optimal threshold selection for network reproducibility.

The graphical intra-class correlation coefficient (GICC) is a scalar statistic that quantifies the reproducibility of repeated network (graph) measurements, particularly for binary edge data such as those encountered in neuroimaging, genomics, and social network studies. GICC generalizes the classic intra-class correlation coefficient (ICC) from univariate or vector-valued data to binary graph-valued data, utilizing a multivariate probit-linear mixed-effects model. The GICC and its estimation framework were introduced to provide a robust, interpretable measure of reliability for complex network data (Yue et al., 2013).

1. Motivation and Conceptual Foundation

Modern applications such as repeated brain-connectivity mapping, test–retest social networks, and high-dimensional genomics routinely generate multiple binary graphs per subject or experimental unit. Classical ICC, defined as $\mathrm{ICC} = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}$ with between-subject variance $\sigma_x^2$ and within-subject variance $\sigma_u^2$, captures the proportion of variance attributable to between-subject heterogeneity for scalar or vector outcomes. However, a direct analog for multivariate binary graph data was lacking.

GICC addresses this gap by providing a scalar index $GICC \in [0,1]$ that measures the reproducible, subject-level heterogeneity in the presence/absence of edges across repeated measurements on a common node set. For a graph with $N$ nodes (and $D = N(N-1)/2$ possible undirected edges), GICC summarizes the fraction of variance explained by consistent structure—across individuals—in the observed binary networks.
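The edge count $D = N(N-1)/2$ corresponds to the strictly upper triangle of a symmetric adjacency matrix. A minimal sketch of that flattening, assuming an undirected graph without self-loops (the helper name `adjacency_to_edges` is hypothetical, not from the source):

```python
import numpy as np

def adjacency_to_edges(adj):
    """Flatten a symmetric N x N binary adjacency matrix into D = N(N-1)/2 edge indicators."""
    adj = np.asarray(adj)
    n = adj.shape[0]
    # Strictly upper triangle, in row-major order: (0,1), (0,2), ..., (N-2,N-1).
    rows, cols = np.triu_indices(n, k=1)
    return adj[rows, cols]

adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 0],
                [1, 0, 0, 0]])
edges = adjacency_to_edges(adj)
print(edges.size)  # 6, i.e. 4 * 3 / 2
print(edges)       # [1 0 1 1 0 0]
```

Each repeated measurement then becomes one binary vector of length $D$, which is the form of data the model in the next section operates on.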

2. Model Formulation and Likelihood Specification

The GICC is built upon a multivariate probit-linear mixed-effects model for binary graphs. For each subject $i=1,\dots,I$, visit/replicate $j=1,\dots,J_i$, and edge $d=1,\dots,D$, the observed edge indicator is $o_{ij}(d) \in \{0,1\}$.

A latent Gaussian threshold model is introduced: $\begin{aligned} o_{ij}(d) &= \mathbf{1}\{y_{ij}(d) > 0\} \\ y_{ij}(d) &= \mu(d) + x_i(d) + u_{ij}(d) \end{aligned}$ where $u_{ij} \sim N(0, I_D)$ is the residual (within-subject) Gaussian noise and $x_i \sim N(0, \Sigma_x)$ is the between-subject random effect, both of dimension $D$.
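To make the generative reading of the threshold model concrete, here is a small simulation sketch. It assumes every subject has the same number of visits (the model itself allows $J_i$ to vary), and the names `simulate_binary_graphs` and `sigma_x` (standing for the between-subject covariance of $x_i$) are illustrative choices rather than anything defined in the source.

```python
import numpy as np

def simulate_binary_graphs(mu, sigma_x, n_subjects, n_visits, seed=0):
    """Draw edge indicators o[i, j, d] from a latent Gaussian threshold model."""
    rng = np.random.default_rng(seed)
    d = mu.shape[0]
    # Between-subject random effects: one D-vector per subject, shared across visits.
    x = rng.multivariate_normal(np.zeros(d), sigma_x, size=n_subjects)
    # Within-subject noise with identity covariance, independent across visits and edges.
    u = rng.standard_normal((n_subjects, n_visits, d))
    y = mu + x[:, None, :] + u       # latent edge propensities
    return (y > 0).astype(int)       # an edge is observed when its propensity is positive

mu = np.zeros(3)
sigma_x = 2.0 * np.eye(3)            # assumed value, for illustration only
o = simulate_binary_graphs(mu, sigma_x, n_subjects=5, n_visits=2)
print(o.shape)  # (5, 2, 3)
```

Because $x_i$ is shared by all visits of subject $i$, a larger between-subject variance makes a subject's repeated graphs agree more often, which is the behavior GICC is meant to summarize.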

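The full likelihood for the latent and observed variables is

$L(\mu, \Sigma_x) = \prod_{i=1}^{I} \left[ \phi_D(x_i; 0, \Sigma_x) \prod_{j=1}^{J_i} \phi_D(y_{ij}; \mu + x_i, I_D) \prod_{d=1}^{D} \mathbf{1}\{ o_{ij}(d) = \mathbf{1}\{ y_{ij}(d) > 0 \} \} \right]$

where $\phi_D(\cdot\,; m, V)$ denotes the $D$-dimensional normal density with mean $m$ and covariance $V$, and $\Sigma_x$ is the covariance of $x_i$.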

3. Definition and Interpretation of GICC

Let $\Sigma_x$ be the $D \times D$ between-subject random-effects covariance, and recall that the residual covariance is fixed at $I_D$. GICC is defined as

$GICC = \frac{\mathrm{tr}(\Sigma_x)}{\mathrm{tr}(\Sigma_x) + \mathrm{tr}(I_D)} = \frac{\mathrm{tr}(\Sigma_x)}{\mathrm{tr}(\Sigma_x) + D}$

where:

$\mathrm{tr}(\Sigma_x)$ is the total between-subject variance (sum of the variances across all $D$ edges).
$\mathrm{tr}(I_D) = D$ is the total within-subject variance, by model construction (residuals are $N(0, I_D)$).

GICC summarizes the proportion of the total (between + within) variance in the latent edge propensities that is due to reproducible, subject-level structure. Values approach 1 for highly reliable, subject-specific network patterns, and 0 for data dominated by measurement error.
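The variance ratio described above can be written in a few lines. This is a sketch, assuming the between-subject covariance is available as a $D \times D$ array and that the within-subject variance is 1 per edge, as the model fixes it; the function name `gicc` is an illustrative choice.

```python
import numpy as np

def gicc(sigma_x):
    """Between-subject variance divided by between- plus within-subject variance."""
    sigma_x = np.asarray(sigma_x, dtype=float)
    d = sigma_x.shape[0]
    between = np.trace(sigma_x)   # sum of per-edge between-subject variances
    within = float(d)             # unit residual variance on each of the D edges
    return between / (between + within)

print(gicc(np.zeros((3, 3))))  # 0.0: no subject-level signal
print(gicc(np.eye(3)))         # 0.5: signal and noise variances are equal
```

Only the diagonal of the covariance enters the ratio, so correlations between edges affect estimation but not the value of the index itself.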

4. Parameter Estimation via MCEM Algorithm

The model’s parameters $(\mu, \Sigma_x)$ are estimated using a Monte Carlo Expectation-Maximization (MCEM) algorithm, treating $(x_i, y_{ij})$ as latent (unobserved) data.

E-Step: Compute conditional expectations $E[x_i \mid o]$, $E[x_i x_i^\top \mid o]$, $E[y_{ij} \mid o]$ under the current parameter estimates. Since these expectations have no closed form, they are approximated via Gibbs sampling from the truncated multivariate normal (respecting the binary constraints imposed by $o_{ij}(d)$).
M-Step: Update $\mu$ and $\Sigma_x$ using the conditional expectations as plug-in statistics:

$\hat{\mu} = \frac{1}{\sum_{i=1}^{I} J_i} \sum_{i=1}^{I} \sum_{j=1}^{J_i} \left( E[y_{ij} \mid o] - E[x_i \mid o] \right)$

$\hat{\Sigma}_x = \frac{1}{I} \sum_{i=1}^{I} E[x_i x_i^\top \mid o]$

The procedure iterates until convergence, yielding maximum likelihood estimates (MLEs) of the parameters and, by plug-in, of the GICC index.
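The E-step/M-step loop can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: every subject has the same number of visits, a fixed number of EM iterations replaces a convergence check, the Gibbs sampler alternates between the latent propensities and the random effects, and the names `sample_truncated` and `mcem_gicc` and all tuning values are hypothetical. It relies on `scipy.special.ndtr` and `ndtri` (normal CDF and its inverse) for truncated draws.

```python
import numpy as np
from scipy.special import ndtr, ndtri

def sample_truncated(mean, positive, rng):
    """Unit-variance normal draws restricted to y > 0 where positive, else y <= 0."""
    p0 = ndtr(-mean)                          # P(y <= 0) for y ~ N(mean, 1)
    u = rng.uniform(size=mean.shape)
    q = np.where(positive, p0 + u * (1.0 - p0), u * p0)
    q = np.clip(q, 1e-12, 1.0 - 1e-12)        # keep the inverse CDF finite
    return mean + ndtri(q)

def mcem_gicc(o, n_iter=30, n_sweeps=60, burn_in=20, seed=0):
    """MCEM sketch for o of shape (I, J, D); returns (mu, sigma_x, gicc)."""
    rng = np.random.default_rng(seed)
    n_sub, n_vis, d = o.shape
    positive = o.astype(bool)
    mu, sigma_x = np.zeros(d), np.eye(d)      # assumed starting values
    x = np.zeros((n_sub, d))
    for _ in range(n_iter):
        # Given y_i1..y_iJ, x_i is Gaussian with covariance v (identity residual covariance).
        v = np.linalg.inv(np.linalg.inv(sigma_x) + n_vis * np.eye(d))
        chol = np.linalg.cholesky(v)
        sum_resid, sum_xx, kept = np.zeros(d), np.zeros((d, d)), 0
        for sweep in range(n_sweeps):
            # Gibbs step 1: each y_ij(d) given x_i and o_ij(d) is a truncated normal.
            mean = mu + np.repeat(x[:, None, :], n_vis, axis=1)
            y = sample_truncated(mean, positive, rng)
            # Gibbs step 2: x_i given y has mean v @ sum_j (y_ij - mu) and covariance v.
            m = (y - mu).sum(axis=1) @ v
            x = m + rng.standard_normal((n_sub, d)) @ chol.T
            if sweep >= burn_in:
                sum_resid += (y - x[:, None, :]).sum(axis=(0, 1))
                sum_xx += x.T @ x
                kept += 1
        # M-step: plug Monte Carlo averages into the closed-form updates.
        mu = sum_resid / (kept * n_sub * n_vis)
        sigma_x = sum_xx / (kept * n_sub)
    gicc = np.trace(sigma_x) / (np.trace(sigma_x) + d)
    return mu, sigma_x, gicc

# Usage on synthetic data (true covariance chosen arbitrarily for illustration).
rng = np.random.default_rng(1)
x_true = rng.multivariate_normal(np.zeros(3), 2.0 * np.eye(3), size=40)
o = ((x_true[:, None, :] + rng.standard_normal((40, 2, 3))) > 0).astype(int)
mu_hat, sigma_hat, gicc_hat = mcem_gicc(o)
print(round(float(gicc_hat), 3))
```

Because the E-step is a Monte Carlo average, successive estimates fluctuate rather than settle exactly; a practical implementation would typically increase the number of Gibbs sweeps as the iterations proceed and monitor the change in the parameters.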

5. Identifiability, Constraints, and Assumptions

The residual covariance is fixed at $I_D$; this “unit-variance” identification anchors the probit link’s scale. $\Sigma_x$ must be positive semi-definite but is otherwise unconstrained. The multivariate probit model structure ensures that GICC is strictly interpretable as a variance ratio for graph-valued data.
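The reason a scale constraint is needed can be checked numerically: multiplying the latent propensities by any positive constant leaves the observed edges unchanged, so the data alone cannot determine the overall scale. The snippet below is a small demonstration with arbitrary, hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.3, -0.5, 1.0])   # arbitrary edge-level means
x = rng.standard_normal(3)        # one subject's random effect
u = rng.standard_normal(3)        # one visit's residual noise
y = mu + x + u                    # latent propensities

c = 5.0                           # any positive rescaling factor
o_original = (y > 0).astype(int)
o_rescaled = (c * y > 0).astype(int)
print(np.array_equal(o_original, o_rescaled))  # True
```

Fixing the residual variance at 1 removes this ambiguity, which is what makes the between-subject variance, and hence the variance ratio, well defined.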

6. Simulation Performance and Empirical Application

Simulation and real-data studies demonstrate the utility of GICC. In simulation, over settings that specify the number of subjects, the number of repeated graphs per subject, and the number of nodes, MCEM estimation of GICC was accurate and precise. For instance:

In one reported setting, the mean MCEM estimate of GICC was approximately 0.702 (standard deviation 0.033).
Larger samples further reduced bias and variance.

A real-data application to the KIRBY21 test–retest fMRI dataset used 21 subjects each scanned twice. After graph construction (via thresholding 7×7 correlation matrices), GICC as a function of the threshold indicated strong reproducibility over a range of thresholds and identified an optimal threshold. When the threshold was increased past 0.6, GICC decreased sharply as most edges vanished.
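The graph-construction step can be illustrated with a toy correlation matrix. This sketch assumes an edge is declared when the raw correlation exceeds the threshold; whether the original analysis thresholded raw or absolute correlations is not stated here, and the names `threshold_graph` and `tau` are hypothetical.

```python
import numpy as np

def threshold_graph(corr, tau):
    """Binary edge vector: an edge is present when the correlation exceeds tau."""
    corr = np.asarray(corr, dtype=float)
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    return (corr[rows, cols] > tau).astype(int)

corr = np.array([[1.0, 0.8, 0.2],
                 [0.8, 1.0, 0.5],
                 [0.2, 0.5, 1.0]])
for tau in (0.1, 0.4, 0.7, 0.9):
    print(tau, threshold_graph(corr, tau))
# 0.1 [1 1 1]
# 0.4 [1 0 1]
# 0.7 [1 0 0]
# 0.9 [0 0 0]
```

As the threshold rises the graph empties out, which is consistent with the sharp loss of reproducibility reported at high thresholds: once nearly all edges are absent for every subject, little subject-specific structure remains to be reproduced.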

7. Summary and Implications

GICC extends ICC methodology to binary graph-valued data through a principled multivariate probit mixed-effects framework. The estimator has a clear interpretation as the reproducible fraction of variance in latent network structure. Parameter estimation uses MCEM with Gibbs sampling for the latent variables. Simulation and real-data analyses support its robustness and its applicability both as a reproducibility metric and as a guide for methodological choices such as threshold setting in graph construction.

GICC is therefore well suited to quantitative reproducibility analysis in graph-based studies where binary connectivity patterns are subject to test–retest or multi-session variability (Yue et al., 2013).

References

Yue et al. (2013). Estimating a graphical intra-class correlation coefficient (GICC) using multivariate probit-linear mixed models.
