Graphical ICC (GICC) for Network Reproducibility

Updated 24 May 2026
  • GICC is a scalar statistic that extends traditional ICC to measure reproducibility in binary graph data using a multivariate probit mixed-effects model.
  • It computes the ratio of between-subject variance to total variance, enabling robust quantification of consistent network patterns across repeated measurements.
  • Simulations and real-data applications, including fMRI test–retest studies, validate GICC’s precision and guide optimal threshold selection for network reproducibility.

The graphical intra-class correlation coefficient (GICC) is a scalar statistic that quantifies the reproducibility of repeated network (graph) measurements, particularly for binary edge data such as those encountered in neuroimaging, genomics, and social network studies. GICC generalizes the classic intra-class correlation coefficient (ICC) from univariate or vector-valued data to binary graph-valued data, utilizing a multivariate probit-linear mixed-effects model. The GICC and its estimation framework were introduced to provide a robust, interpretable measure of reliability for complex network data (Yue et al., 2013).

1. Motivation and Conceptual Foundation

Modern applications such as repeated brain-connectivity mapping, test–retest social networks, and high-dimensional genomics routinely generate multiple binary graphs per subject or experimental unit. Classical ICC, defined as $\mathrm{ICC} = \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}$ with between-subject variance $\sigma_x^2$ and within-subject variance $\sigma_u^2$, captures the proportion of variance attributable to between-subject heterogeneity for scalar or vector outcomes. However, a direct analog for multivariate binary graph data was lacking.

GICC addresses this gap by providing a scalar index $\mathrm{GICC} \in [0,1]$ that measures the reproducible, subject-level heterogeneity in the presence/absence of edges across repeated measurements on a common node set. For a graph with $N$ nodes (and $D = N(N-1)/2$ possible undirected edges), GICC summarizes the fraction of variance explained by consistent structure across individuals in the observed binary networks.
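To make the edge indexing concrete, here is a minimal Python sketch that flattens a symmetric binary adjacency matrix into a length-$D$ edge vector. The function name `edge_vector` and the row-by-row ordering of the upper triangle are illustrative assumptions, not taken from the source; any fixed ordering works as long as it is shared across subjects and visits.

```python
def edge_vector(adj):
    """Flatten the strict upper triangle of a symmetric 0/1 adjacency matrix.

    `adj` is a list of N lists of length N. The row-by-row ordering is an
    illustrative choice, not prescribed by the source.
    """
    n = len(adj)
    for row in adj:
        if len(row) != n:
            raise ValueError("adjacency matrix must be square")
    return [adj[a][b] for a in range(n) for b in range(a + 1, n)]


# A 4-node undirected graph with edges 0-1, 0-3, and 2-3.
adj = [
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
]
o = edge_vector(adj)
print(len(o))  # D = 4 * 3 / 2 = 6
print(o)       # [1, 0, 1, 0, 0, 1]
```

Each repeated measurement of a subject then becomes one such vector, which is the unit the model operates on.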

2. Model Formulation and Likelihood Specification

The GICC is built upon a multivariate probit-linear mixed-effects model for binary graphs. For each subject $i=1,\dots,I$, visit/replicate $j=1,\dots,J_i$, and edge $d=1,\dots,D$, the observed edge indicator is $o_{ij}(d) \in \{0,1\}$.

A latent Gaussian threshold model is introduced:

$$\begin{aligned} o_{ij}(d) &= \mathbf{1}\{y_{ij}(d) > 0\} \\ y_{ij}(d) &= \mu(d) + x_i(d) + u_{ij}(d) \end{aligned}$$

where $u_{ij} \sim N(0, I_D)$ is the residual (within-subject) Gaussian noise and $x_i \sim N(0, \Sigma_x)$ is the between-subject random effect, both of dimension $D$.
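The generative reading of this model can be sketched in a few lines of Python. This is an illustration under a simplifying assumption that is not in the source: the between-subject covariance is taken to be diagonal, so subject effects are drawn independently per edge. The function name `simulate_graphs` and all numeric values are hypothetical.

```python
import random


def simulate_graphs(mu, sigma2_x, n_subjects, n_visits, seed=0):
    """Draw binary edge vectors data[i][j][d] from the latent threshold model.

    Assumption for illustration: the between-subject covariance is diagonal
    with entries `sigma2_x`, so subject effects are independent across edges.
    """
    rng = random.Random(seed)
    n_edges = len(mu)
    data = []
    for _ in range(n_subjects):
        # Subject-level random effect, shared by all visits of this subject.
        x = [rng.gauss(0.0, sigma2_x[d] ** 0.5) for d in range(n_edges)]
        visits = []
        for _ in range(n_visits):
            # Latent propensity y = mu + x + u, with u ~ N(0, 1) per edge.
            y = [mu[d] + x[d] + rng.gauss(0.0, 1.0) for d in range(n_edges)]
            # An edge is observed exactly when its latent propensity is positive.
            visits.append([1 if y_d > 0 else 0 for y_d in y])
        data.append(visits)
    return data


graphs = simulate_graphs(mu=[0.0, 0.5, -0.5], sigma2_x=[2.0, 2.0, 2.0],
                         n_subjects=5, n_visits=2)
print(graphs[0])  # two binary edge vectors for the first subject
```

Because the subject effect is shared across visits while the residual is redrawn, larger between-subject variances make a subject's repeated graphs agree more often.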

The full likelihood for the latent and observed variables is

$$L(\mu, \Sigma_x) = \prod_{i=1}^{I} \left[ \phi_D(x_i; 0, \Sigma_x) \prod_{j=1}^{J_i} \phi_D(y_{ij}; \mu + x_i, I_D) \prod_{d=1}^{D} \mathbf{1}\{ o_{ij}(d) = \mathbf{1}[y_{ij}(d) > 0] \} \right]$$

where $\phi_D(\cdot\,; m, \Sigma)$ denotes the $D$-dimensional normal density with mean $m$ and covariance $\Sigma$.

3. Definition and Interpretation of GICC

Let $\Sigma_x$ be the $D \times D$ between-subject random-effects covariance, and recall that the residual covariance is fixed at $I_D$. GICC is defined as

$$\mathrm{GICC} = \frac{\mathrm{tr}(\Sigma_x)}{\mathrm{tr}(\Sigma_x) + D}$$

where:

  • $\mathrm{tr}(\Sigma_x)$ is the total between-subject variance (sum of the variances across all $D$ edges).
  • $D = \mathrm{tr}(I_D)$ is the total within-subject variance, by model construction (residuals are $N(0, I_D)$).

GICC summarizes the proportion of the total (between + within) variance in the latent edge propensities that is due to reproducible, subject-level structure. Values approach 1 for highly reliable, subject-specific network patterns, and 0 for data dominated by measurement error.
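As a small worked example, the ratio can be computed directly from a between-subject covariance matrix: the summed edge variances divided by that sum plus the number of edges $D$ (the residual variance is 1 per edge). The function name `gicc_from_covariance` and the matrix values are hypothetical.

```python
def gicc_from_covariance(sigma_x):
    """Return tr(Sigma_x) / (tr(Sigma_x) + D), with unit residual variance per edge."""
    d = len(sigma_x)
    trace = sum(sigma_x[k][k] for k in range(d))
    return trace / (trace + d)


# Hypothetical 3-edge covariance; only the diagonal enters the ratio.
sigma_x = [
    [2.0, 0.3, 0.0],
    [0.3, 1.0, 0.1],
    [0.0, 0.1, 3.0],
]
print(gicc_from_covariance(sigma_x))  # 6 / (6 + 3) = 0.666...
```

Note that the off-diagonal covariances between edges do not affect the value; GICC depends only on the per-edge between-subject variances.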

4. Parameter Estimation via MCEM Algorithm

The model’s parameters $(\mu, \Sigma_x)$ are estimated using a Monte Carlo Expectation-Maximization (MCEM) algorithm, treating $(y_{ij}, x_i)$ as latent (unobserved) data.

  • E-Step: Compute conditional expectations $E[y_{ij} \mid o]$, $E[x_i \mid o]$, $E[x_i x_i^\top \mid o]$ under the current parameter estimates. Since these expectations have no closed form, they are approximated via Gibbs sampling from the truncated multivariate normal (respecting the binary constraints imposed by $o_{ij}$).
  • M-Step: Update $\mu$ and $\Sigma_x$ using the conditional expectations as plug-in statistics:

    $$\hat{\mu} = \frac{1}{\sum_{i=1}^{I} J_i} \sum_{i=1}^{I} \sum_{j=1}^{J_i} \left( E[y_{ij} \mid o] - E[x_i \mid o] \right)$$

    $$\hat{\Sigma}_x = \frac{1}{I} \sum_{i=1}^{I} E[x_i x_i^\top \mid o]$$

The procedure iterates until convergence, yielding MLEs for the parameters and the GICC index.
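The E-step/M-step loop can be sketched in pure Python under a strong simplification that is not the authors' algorithm: the between-subject covariance is assumed diagonal, so each edge is fitted separately as a univariate probit random-intercept model. Since GICC depends only on the per-edge variances, this still yields a GICC-type ratio. All function names, iteration counts, and simulated values are hypothetical, and the small Monte Carlo sizes are chosen for speed rather than accuracy.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_normal(mean, positive, rng):
    """Draw y ~ N(mean, 1) restricted to y > 0 (positive) or y < 0 (otherwise)."""
    sign = 1.0 if positive else -1.0
    # With y = mean - sign * w, the constraint becomes w < sign * mean.
    tail = STD_NORMAL.cdf(sign * mean)
    p = max(rng.random() * tail, 1e-300)  # keep inv_cdf strictly inside (0, 1)
    return mean - sign * STD_NORMAL.inv_cdf(p)


def mcem_edge(o, n_iter=15, n_gibbs=60, burn_in=20, seed=0):
    """Estimate (mu, sigma2) for one edge from indicators o[i][j] in {0, 1}."""
    rng = random.Random(seed)
    mu, s2 = 0.0, 1.0
    x = [0.0] * len(o)  # Gibbs state for subject effects, carried across iterations
    kept = n_gibbs - burn_in
    for _ in range(n_iter):
        sum_resid, sum_x2, n_obs = 0.0, 0.0, 0
        for i, o_i in enumerate(o):
            n_visits = len(o_i)
            post_var = s2 / (1.0 + n_visits * s2)  # Var(x_i | y_i)
            for sweep in range(n_gibbs):
                # y_ij | x_i, o_ij: unit-variance normal truncated by the observed bit.
                y = [truncated_normal(mu + x[i], bit == 1, rng) for bit in o_i]
                # x_i | y_i: Gaussian with the conjugate posterior mean and variance.
                x[i] = rng.gauss(post_var * sum(v - mu for v in y), post_var ** 0.5)
                if sweep >= burn_in:
                    sum_resid += sum(v - x[i] for v in y) / kept
                    sum_x2 += x[i] ** 2 / kept
            n_obs += n_visits
        # M-step: plug Monte Carlo averages into the closed-form updates.
        mu = sum_resid / n_obs
        s2 = max(sum_x2 / len(o), 1e-6)
    return mu, s2


def mcem_gicc(data, **kwargs):
    """GICC-type ratio from per-edge fits; data[i][j][d] are binary indicators."""
    n_edges = len(data[0][0])
    variances = []
    for d in range(n_edges):
        o_d = [[visit[d] for visit in subject] for subject in data]
        variances.append(mcem_edge(o_d, **kwargs)[1])
    total = sum(variances)
    return total / (total + n_edges)


# Hypothetical data: 30 subjects, 2 visits, 3 edges, between-subject sd 1.5.
rng = random.Random(1)
data = []
for _ in range(30):
    x_true = [rng.gauss(0.0, 1.5) for _ in range(3)]
    data.append([[1 if v + rng.gauss(0.0, 1.0) > 0 else 0 for v in x_true]
                 for _ in range(2)])
print(mcem_gicc(data))
```

With only two visits per subject the variance estimates are noisy, so a realistic run would use more Gibbs sweeps, more EM iterations, and an explicit convergence check.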

5. Identifiability, Constraints, and Assumptions

The residual covariance is fixed at $I_D$; this “unit-variance” identification anchors the probit link’s scale. $\Sigma_x$ must be positive semi-definite but is otherwise unconstrained. The multivariate probit model structure ensures that GICC is directly interpretable as a variance ratio for graph-valued data.
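The need for a scale constraint follows from the standard probit identification argument, sketched here with the residual covariance written as $\Sigma_u$ (a symbol introduced for this illustration). Thresholding at zero is unchanged by positive rescaling of each latent coordinate:

$$\mathbf{1}\{y_{ij}(d) > 0\} = \mathbf{1}\{c_d\, y_{ij}(d) > 0\} \quad \text{for any } c_d > 0.$$

Hence, with $C = \mathrm{diag}(c_1, \dots, c_D)$, the parameter sets

$$(\mu,\ \Sigma_x,\ \Sigma_u) \quad \text{and} \quad (C\mu,\ C\Sigma_x C,\ C\Sigma_u C)$$

induce the same distribution of observed graphs, so the per-edge scales cannot be learned from binary data. Fixing the diagonal of $\Sigma_u$ at 1 removes this ambiguity; setting $\Sigma_u = I_D$ additionally assumes that residuals are uncorrelated across edges.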

6. Simulation Performance and Empirical Application

Simulation and real-data studies demonstrate the utility of GICC. In simulations parameterized by the number of subjects $I$, the number of repeated graphs $J$, the number of nodes $N$ (hence $D = N(N-1)/2$ edges), and the between-subject covariance $\Sigma_x$, MCEM estimation of GICC was accurate and precise. For instance:

  • In one configuration, the mean MCEM estimate was approximately 0.702 (std 0.033), close to the true GICC.
  • Increasing the sample size further reduced bias and variance.

A real-data application to the KIRBY21 test–retest fMRI dataset used 21 subjects, each scanned twice. After graph construction (via thresholding 7×7 correlation matrices), GICC as a function of the threshold indicated strong reproducibility over a range of thresholds, with an optimum within that range. When the threshold was increased past 0.6, GICC decreased sharply as most edges vanished.
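The graph-construction step can be illustrated with a toy example. The sketch below assumes an edge is kept when the correlation strictly exceeds the threshold; the source does not state the exact rule (for example, whether absolute values are used), and the 4×4 matrix, the name `threshold_graph`, and the threshold grid are hypothetical.

```python
def threshold_graph(corr, tau):
    """Binary edge vector: 1 where an off-diagonal correlation exceeds tau."""
    n = len(corr)
    return [1 if corr[a][b] > tau else 0
            for a in range(n) for b in range(a + 1, n)]


# Hypothetical 4x4 correlation matrix (the study used 7x7 matrices).
corr = [
    [1.00, 0.80, 0.30, 0.10],
    [0.80, 1.00, 0.50, 0.20],
    [0.30, 0.50, 1.00, 0.65],
    [0.10, 0.20, 0.65, 1.00],
]
for tau in (0.2, 0.4, 0.6, 0.9):
    edges = threshold_graph(corr, tau)
    print(tau, edges, sum(edges))
```

In this toy matrix the edge count falls from 4 to 0 as the threshold rises, which mirrors why reproducibility estimates degrade at high thresholds: there is little edge variation left to attribute to subjects.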

7. Summary and Implications

GICC extends ICC methodology to binary graph-valued data through a principled multivariate probit mixed-effects framework. The estimator has a clear interpretation as the reproducible fraction of variance in latent network structure. Parameter estimation uses MCEM with Gibbs sampling for the latent variables. Simulation and real-data analysis support its robustness and applicability both as a reproducibility metric and as a guide for methodological choices such as threshold setting in graph construction.

These results position GICC as a practical tool for quantitative reproducibility analysis in graph-based studies where binary connectivity patterns are subject to test–retest or multi-session variability (Yue et al., 2013).
