Effect of maximum edge size on HCM predictive performance for USPTO

Determine whether the small maximum edge size of the USPTO organic reactions hypergraph (8) contributes to the comparatively poor link prediction performance of the Hyperedge Copy Model (HCM) on that dataset under the benchmarking setup described in Table 2. For comparison, the iAF1260b and iJO1366 metabolic reaction hypergraphs have maximum edge sizes of 67 and 106, respectively.

Background

The paper benchmarks the Hyperedge Copy Model (HCM) against two neural network methods, NHP and LHP, on three datasets: iAF1260b, iJO1366 (both metabolic reaction hypergraphs), and USPTO (an organic reactions hypergraph). Using the same training/testing split and negative edge generation procedure as prior work, the authors report AUC and F1 metrics for each method.
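As a point of reference for the two metrics, the following is a minimal sketch of how AUC and F1 can be computed from scores assigned to held-out (positive) and generated negative hyperedges. The function names, the toy scores, and the 0.5 threshold are illustrative assumptions, not taken from the paper, and the pairwise AUC shown here is the textbook definition rather than the authors' evaluation code.

```python
def auc_score(pos_scores, neg_scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def f1_score(pos_scores, neg_scores, threshold):
    """F1 when every hyperedge scoring at least `threshold` is predicted positive."""
    tp = sum(s >= threshold for s in pos_scores)   # true positives
    fn = len(pos_scores) - tp                      # missed positives
    fp = sum(s >= threshold for s in neg_scores)   # negatives predicted positive
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical model scores for three true and three negative hyperedges.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]

print(auc_score(pos, neg))        # 8 of 9 positive/negative pairs are ranked correctly
print(f1_score(pos, neg, 0.5))    # precision and recall are both 2/3 at this threshold
```

AUC depends only on the ranking of scores, while F1 depends on the chosen threshold, so the two metrics can disagree about which method performs better on a given dataset.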

HCM substantially outperforms both neural baselines on iJO1366, performs better than NHP but worse than LHP on iAF1260b, and performs worse than both neural methods on USPTO. The authors note that USPTO has a much smaller maximum edge size (8) compared to iAF1260b (67) and iJO1366 (106), and explicitly conjecture that this difference may contribute to HCM’s comparatively poor performance on USPTO.
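The edge-size statistic at the centre of the conjecture is simple to compute. The sketch below, which assumes a hypergraph stored as a list of node sets, reports the maximum edge size and the edge-size histogram, and also shows one hypothetical way to probe the conjecture: discarding hyperedges above a size cap (for example 8, to mimic USPTO) before re-running a benchmark. The helper names and toy data are assumptions for illustration; this capping experiment is not something the paper describes.

```python
from collections import Counter


def max_edge_size(hyperedges):
    """Largest number of distinct nodes in any hyperedge."""
    return max(len(set(e)) for e in hyperedges)


def edge_size_histogram(hyperedges):
    """Map each edge size to the number of hyperedges of that size."""
    return Counter(len(set(e)) for e in hyperedges)


def cap_edge_size(hyperedges, cap):
    """Keep only hyperedges with at most `cap` distinct nodes (hypothetical probe)."""
    return [e for e in hyperedges if len(set(e)) <= cap]


# Toy hypergraph: each hyperedge is a set of node labels.
toy = [
    {"a", "b"},
    {"a", "b", "c"},
    {"b", "c", "d", "e", "f"},
    {"a", "c", "d", "e", "f", "g", "h", "i", "j"},
]

print(max_edge_size(toy))                         # 9
print(sorted(edge_size_histogram(toy).items()))   # [(2, 1), (3, 1), (5, 1), (9, 1)]

capped = cap_edge_size(toy, 8)
print(len(capped), max_edge_size(capped))         # 3 5
```

Applying such a cap to iAF1260b or iJO1366 and comparing HCM's metrics before and after would give one indication of whether maximum edge size matters, although removing large edges also changes other properties of the data (edge count, overlap structure), so any difference would need careful interpretation.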

References

We conjecture that the comparatively poor performance of our model reflects in part the small maximum edge size of this data set compared to the other two.

— Hypergraph Link Prediction via Hyperedge Copying (2502.02386 - He et al., 4 Feb 2025) in Results, paragraph discussing Table 2 (label 'tab:benchmarking')
