Graph Open-Set Recognition (GOSR)
- Graph Open-Set Recognition is a framework that extends traditional graph classification by explicitly detecting unknown classes using threshold-based rejection.
- It employs techniques such as proxy-based unknown generation, prototype learning, and anomaly detection to enhance both node-level and graph-level performance.
- Research in GOSR highlights challenges like threshold sensitivity and classification-detection trade-offs, driving improvements in robustness and domain adaptation.
Graph Open-Set Recognition (GOSR) is a research area within the broader context of graph representation learning, focusing on the problem of classifying nodes or entire graphs not only when their class labels are among the set observed during training (“known classes”), but also detecting and properly rejecting (“recognizing as unknown”) samples originating from truly novel, unseen classes. Unlike the standard closed-set scenario—where all test data are assumed to belong to known classes—GOSR methods are explicitly designed to address the class distribution shift arising from the appearance of unmodeled, out-of-distribution (OOD) categories, a scenario arising frequently in practical deployments of graph-based machine learning.
1. Formal Problem Definition
Graph Open-Set Recognition generalizes closed-set node/graph classification by requiring a model both to accurately assign samples to known classes and to explicitly detect when a sample comes from an unknown class. The classic closed-set classifier selects the class $\hat{y} = \arg\max_{c \in \mathcal{C}_{\text{known}}} p(c \mid x)$, where $x$ is a node or graph and $p(c \mid x)$ is the model's conditional probability. In GOSR, the classifier applies a joint decision rule with an explicit unknown rejection:

$$\hat{y} = \begin{cases} \arg\max_{c \in \mathcal{C}_{\text{known}}} p(c \mid x) & \text{if } \max_{c \in \mathcal{C}_{\text{known}}} p(c \mid x) \ge \tau, \\ \text{unknown} & \text{otherwise,} \end{cases}$$

where the threshold $\tau$ separates knowns from unknowns (Dong et al., 1 Mar 2025).
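As a concrete illustration, the thresholded decision rule above can be sketched in a few lines of numpy; the class probabilities and the value of the threshold `tau` here are purely illustrative:

```python
import numpy as np

def open_set_predict(probs, tau=0.5):
    """Joint open-set decision rule: argmax over known classes when the
    top class probability clears the threshold tau, else 'unknown' (-1)."""
    probs = np.asarray(probs, dtype=float)
    conf = probs.max(axis=-1)
    pred = probs.argmax(axis=-1)
    return np.where(conf >= tau, pred, -1)

# Two confident known-class samples and one ambiguous sample.
probs = [[0.90, 0.05, 0.05],
         [0.10, 0.80, 0.10],
         [0.40, 0.35, 0.25]]
preds = open_set_predict(probs, tau=0.5)  # third sample rejected as unknown
```

In practice `tau` is tuned on a validation split, which is precisely the threshold-sensitivity issue discussed later in this article.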
Two prominent GOSR settings exist:
- Node-level: Assign each graph node to one of the known classes or to "unknown".
- Graph-level: Assign each whole graph (e.g., a molecule) to a known class or recognize it as unknown.
The open-set condition implies that during testing, data can belong to an unknown label set $\mathcal{C}_{\text{unknown}}$, which is disjoint from the known label set $\mathcal{C}_{\text{known}}$ provided to the learner during training (Dong et al., 1 Mar 2025, Zhang et al., 28 Feb 2024).
2. Methodological Taxonomy and Benchmarks
A key advance in the understanding and evaluation of GOSR methods is provided by the "G-OSR" benchmark, which standardizes open-set splits and evaluation metrics across a broad suite of datasets and method categories (Dong et al., 1 Mar 2025). GOSR approaches can be grouped as follows:
- Traditional closed-set GNNs with OOD extensions: Standard GNNs (GCN, GAT, GIN) trained with cross-entropy, equipped post hoc with an OOD score such as Max-Softmax Prediction (MSP), OpenMax extended with a Weibull probability for “unknown”, or LogitNorm (Dong et al., 1 Mar 2025).
- GOODD (Graph Out-of-Distribution Detection) techniques: Methods like GraphDE, GOOD-D, and SGOOD, which focus on OOD scoring (energy, Mahalanobis distance) and adapt this score for open-set gating.
- Dedicated GOSR methods: Models like OpenWGL, EM-GOSR, and GOOSE introduce explicit reject heads, entropy maximization on pseudo-outliers, or synthesis of negative samples via graph augmentation, directly optimizing open-set classification.
- Graph anomaly detection adaptations: GAD models (GAE-Recon, CoLA, GUIDE, ConAD) reconstruct node/graph features or use contrastive learning to derive anomaly scores, then combine these with a separate classifier for final prediction.
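The simplest of the post-hoc scores above, MSP, amounts to taking the maximum softmax probability of an already-trained closed-set classifier; a minimal sketch (the logits are illustrative):

```python
import numpy as np

def msp_score(logits):
    """Max-Softmax Prediction (MSP): a post-hoc OOD score computed from a
    trained closed-set classifier's logits; a low maximum softmax
    probability is taken as evidence of an unknown-class sample."""
    logits = np.asarray(logits, dtype=float)
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    return (e / e.sum(axis=-1, keepdims=True)).max(axis=-1)
```

A confident prediction (one dominant logit) yields a score near 1, while a uniform distribution over $C$ classes yields $1/C$, the floor of the score.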
The G-OSR benchmark provides comprehensive comparisons with common evaluation metrics—AUROC, FPR@95%TPR, and open-set F1—across node- and graph-level tasks, showing that dedicated GOSR models consistently outperform OOD-scoring extensions and anomaly detection baselines on open-set F1 and AUROC (Dong et al., 1 Mar 2025).
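Both detection metrics can be computed without special tooling; a minimal sketch of AUROC and FPR@95%TPR, assuming higher scores mean "more likely known":

```python
import numpy as np

def auroc(scores_known, scores_unknown):
    """AUROC as the probability that a random known sample receives a
    higher 'knownness' score than a random unknown sample."""
    k = np.asarray(scores_known, dtype=float)[:, None]
    u = np.asarray(scores_unknown, dtype=float)[None, :]
    return float((k > u).mean() + 0.5 * (k == u).mean())

def fpr_at_tpr(scores_known, scores_unknown, tpr=0.95):
    """FPR@95%TPR: the fraction of unknowns wrongly accepted as known at
    the score threshold that still admits 95% of known samples."""
    thresh = np.quantile(np.asarray(scores_known, dtype=float), 1.0 - tpr)
    return float(np.mean(np.asarray(scores_unknown) >= thresh))
```

The pairwise formulation of AUROC is quadratic in sample count; benchmark implementations typically use a rank-based equivalent, but the value is the same.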
3. Generative and Proxy-Based GOSR Approaches
Inductive open-set node classification, where no features or labels of unknown-class nodes are visible until test time, is addressed by Pxy (Zhang et al., 2023). This method generates two types of proxy unknown nodes using manifold mixup: inter-class proxies are created by mixing the hidden representations of the endpoints of edges whose labels differ; external proxies are produced by perturbing peripheral known-class nodes away from their class-center embedding. These proxies are assigned an augmented “unknown” class during training, extending the classifier's output space to $C + 1$ classes (the $C$ known classes plus one unknown class).
Training employs both cross-entropy and a complement entropy loss that, for proxy samples, encourages low-confidence assignment to any known class and, for real known samples, lowers the probability of the proxy (unknown) class. Weak or no reliance on explicit rejection thresholds makes this approach robust in inductive regimes. Empirical results on citation graphs (Cora, Citeseer, DBLP, PubMed) show that Pxy surpasses strong baselines in both accuracy and macro-F1 for open-set tasks (Zhang et al., 2023).
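The inter-class proxy generation step can be sketched as follows, assuming node embeddings `h`, node labels, and an edge list; the Beta mixing distribution and the function name are illustrative, not the paper's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def inter_class_proxies(h, labels, edges, alpha=2.0):
    """Manifold-mixup proxies: mix the hidden representations of edge
    endpoints whose labels differ, yielding synthetic 'unknown' nodes
    that lie between known-class regions of the embedding space."""
    proxies = []
    for u, v in edges:
        if labels[u] != labels[v]:
            lam = rng.beta(alpha, alpha)  # mixing coefficient in (0, 1)
            proxies.append(lam * h[u] + (1.0 - lam) * h[v])
    return np.stack(proxies) if proxies else np.empty((0, h.shape[1]))
```

Only heterophilous edges (endpoints with differing labels) produce proxies, which is what places them in low-density regions between known classes.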
Similarly, OGCIL (Chen et al., 23 Jul 2025) introduces a prototypical conditional VAE to synthesize pseudo in-distribution samples for incremental learning. Unknown (OOD) samples are generated as linear mixes of embeddings from different known classes, and a prototypical hypersphere classification loss enforces compact known-class clusters and robust rejection of OOD samples based on their distance from class prototypes.
The inference rule assigns a sample to its closest prototype when the corresponding score exceeds a threshold, and predicts unknown otherwise. On standard benchmarks, this method provides state-of-the-art OSCR (open-set classification rate), closed-set accuracy, and AUC measures (Chen et al., 23 Jul 2025).
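Nearest-prototype inference with rejection can be sketched as follows, here using Euclidean distance with a distance threshold `tau` as an illustrative stand-in for the paper's exact scoring:

```python
import numpy as np

def prototype_predict(z, prototypes, tau):
    """Nearest-prototype open-set inference: assign each embedding to the
    class of its closest prototype if it falls within distance tau of
    that prototype, otherwise label it unknown (-1)."""
    z = np.asarray(z, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    d = np.linalg.norm(z[:, None, :] - prototypes[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    return np.where(d.min(axis=1) <= tau, nearest, -1)
```

Compact known-class clusters (as enforced by the hypersphere loss) are what make a single radius `tau` workable across classes.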
4. Prototype and Region-Based Discrimination
Prototype-based learning frameworks have demonstrated significant robustness for GOSR under both in-distribution (IND) and out-of-distribution (OOD) label noise. The ROG method (Zhang et al., 28 Feb 2024) introduces a two-stage model:
- Denoising via label propagation on a k-NN graph in the latent space, followed by removal of low-confidence nodes.
- Region-based prototype learning: for each class, multiple regions (via k-means clustering) produce both interior (homogeneous clusters) and border (mixed clusters) prototypes.
For inference, cosine similarities between node representations and class prototypes are maximized, with a threshold for detection of unknowns. The overall loss blends a classification term (with label smoothing) and an orthogonality-promoting diversity term.
Ablation studies confirm that both denoising and explicit border/interior prototypes are critical for performance under high label noise and OOD training contamination. ROG sets a new baseline for robustness in GOSR by outperforming prior art in macro-F1 and AUROC (Zhang et al., 28 Feb 2024).
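A sketch of region-prototype inference with cosine similarity, assuming a flat array of prototypes plus a parallel array mapping each prototype (interior or border) to its class; the threshold value is illustrative:

```python
import numpy as np

def region_prototype_infer(z, protos, proto_class, tau):
    """ROG-style inference sketch: score each node by its maximum cosine
    similarity to any region prototype (several per class); keep the
    best-matching class only if that similarity clears threshold tau."""
    z = np.asarray(z, dtype=float)
    protos = np.asarray(protos, dtype=float)
    zn = z / np.linalg.norm(z, axis=-1, keepdims=True)
    pn = protos / np.linalg.norm(protos, axis=-1, keepdims=True)
    sim = zn @ pn.T                      # (n_nodes, n_prototypes)
    best = sim.argmax(axis=1)
    return np.where(sim.max(axis=1) >= tau, np.asarray(proto_class)[best], -1)
```

Multiple prototypes per class let a single class cover several embedding regions, which is the point of the interior/border distinction.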
5. Domain Adaptation and Cross-Graph Settings
Unsupervised open-set graph domain adaptation is investigated in the GraphRTA framework (Zhang et al., 21 Oct 2025). This scenario extends GOSR to situations where a labeled source graph $\mathcal{G}_s$ and an unlabeled target graph $\mathcal{G}_t$ differ in label space. The goal is to classify nodes from $\mathcal{G}_t$ into either known classes (seen in $\mathcal{G}_s$) or “unknown”. GraphRTA leverages:
- Graph reprogramming: Learning perturbations in target graph structure and features to enhance known/unknown separability.
- Model reprogramming: Pruning domain-specific parameters to reduce bias towards the source, retaining only transferable parameters.
- Open-set classifier extension: Explicitly augmenting the classifier output with an “unknown” dimension, thus eliminating manual thresholding.
Experimental results demonstrate competitive or superior performance compared to recent state-of-the-art methods, with the explicit unknown-class output simplifying the open-set decision process (Zhang et al., 21 Oct 2025).
6. Specialized Architectures and Statistical Thresholding
In open-set malware classification, FCG-based representations and self-supervised pre-training can produce permutation-invariant embeddings well-suited for unknown class detection (Jia et al., 2022). The proposed approach uses function call graph (FCG) representations, isomorphic graph transformations (shifting and permuting adjacency matrices), and a Detransformation Autoencoder (DTAE) for self-supervised learning. Unknown detection relies on normalized distances to class centroids (3-sigma rule):
$$\|z - c_k\| > \mu_k + 3\sigma_k \;\Rightarrow\; \text{unknown},$$

where $c_k$ is the class centroid and $\mu_k$, $\sigma_k$ are the mean and standard deviation of the within-class distances to $c_k$. Empirically, this threshold performs on par with tuned quantiles but is parameter-free.
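The 3-sigma rule is straightforward to implement; a sketch assuming an embedding matrix with integer class labels (function names are illustrative):

```python
import numpy as np

def three_sigma_thresholds(embeddings, labels):
    """Per-class centroid c_k and threshold mu_k + 3*sigma_k, where mu_k
    and sigma_k are the mean and std of within-class distances to c_k."""
    out = {}
    for k in np.unique(labels):
        zc = embeddings[labels == k]
        c = zc.mean(axis=0)
        d = np.linalg.norm(zc - c, axis=1)
        out[int(k)] = (c, d.mean() + 3.0 * d.std())
    return out

def is_unknown(z, thresholds):
    """A sample is rejected as unknown if it lies outside the 3-sigma
    radius of every known class."""
    return all(np.linalg.norm(z - c) > t for c, t in thresholds.values())
```

Because the radius is derived from each class's own distance statistics, no rejection threshold has to be tuned by hand.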
7. Challenges, Performance Trends, and Open Directions
The G-OSR survey reveals several persistent challenges (Dong et al., 1 Mar 2025):
- Threshold Sensitivity: All GOSR methods (except some proxy-based approaches) depend on explicit or implicit thresholding for unknown rejection, with settings sensitive to dataset and domain shifts.
- Classification-Detection Trade-off: GOODD methods excel at unknown detection, but often underperform on discriminating known classes, while closed-set classifiers struggle with unknowns.
- Generalization and Scalability: Prototype and proxy-based methods (e.g., ROG, Pxy) demonstrate superior robustness under complex noise and strict inductive conditions but may encounter scalability or hyperparameter tuning issues.
- Dynamic/Continual Learning: OGCIL shows how pseudo-replay and explicit OOD simulation can enable effective open-set class-incremental learning, but practical scenarios (dynamic graphs, temporally-evolving node/graph sets) remain a frontier for GOSR research (Chen et al., 23 Jul 2025).
Future research directions highlighted by the literature include dynamic or meta-learned thresholding, hierarchical-semantic open-set learning, seamless integration with domain adaptation, and scale-up to industrial graph datasets (Dong et al., 1 Mar 2025). The provision of unified benchmarks and taxonomies via G-OSR is expected to accelerate systematic advances in this area.
Key References:
- "G-OSR: A Comprehensive Benchmark for Graph Open-Set Recognition" (Dong et al., 1 Mar 2025)
- "Pxy: Generative Open-Set Node Classification on Graphs with Proxy Unknowns" (Zhang et al., 2023)
- "ROG: Robust Open-Set Graph Learning via Region-Based Prototype Learning" (Zhang et al., 28 Feb 2024)
- "Towards Effective Open-set Graph Class-incremental Learning" (Chen et al., 23 Jul 2025)
- "Towards Unsupervised Open-Set Graph Domain Adaptation via Dual Reprogramming" (Zhang et al., 21 Oct 2025)
- "Representation learning with function call graph transformations for malware open set recognition" (Jia et al., 2022)