Oxytrees: Efficient Proxy-Based Biclustering Trees

Updated 23 November 2025
  • Oxytrees are proxy-based biclustering model trees that facilitate efficient bipartite learning on sparse interaction matrices.
  • They employ proxy-based acceleration to compress interaction matrices into row and column proxies, dramatically reducing computational complexity.
  • Their integration of kernelized leaf models using Kronecker product kernels yields expressive local predictions with faster training and inference.

Oxytrees are proxy-based biclustering model trees designed for efficient and interpretable bipartite learning—predicting binary interactions for pairs from two feature domains. They address scale, inductive generalization, and interpretability in settings where interactions are organized as sparse matrices, such as drug–target, RNA–disease, or regulatory networks. Oxytrees improve upon previous biclustering forests by compressing the interaction matrix into row and column proxies, enabling fast construction and inference with shallower trees whose leaves host expressive linear models parameterized by Kronecker product kernels (Ilídio et al., 16 Nov 2025).

1. Bipartite Learning Setting and Motivation

Bipartite learning seeks to infer a function $Y^{ij} = f(x_1^i, x_2^j) \in \{0,1\}$ given samples $X_1 = \{x_1^i\}_{i=1}^{n_1} \subseteq \mathbb{R}^{m_1}$ and $X_2 = \{x_2^j\}_{j=1}^{n_2} \subseteq \mathbb{R}^{m_2}$, forming a sparse bipartite interaction matrix $Y \in \{0,1\}^{n_1 \times n_2}$. Challenges arise from:

  • Scalability: $n_1$ and $n_2$ may both reach thousands.
  • Generalization: Models must predict for unseen $x_1, x_2$ instances (inductive setting).
  • Interpretability: The goal is a two-dimensional partitioning—biclustering—that remains comprehensible and locally predictive.
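To make the setting concrete, the sketch below builds a toy instance of the data layout described above (random data, illustrative sizes only): two feature matrices, one per domain, and a sparse binary interaction matrix over their Cartesian product.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, m1 = 50, 8   # e.g. 50 drugs with 8 features each (illustrative sizes)
n2, m2 = 40, 5   # e.g. 40 targets with 5 features each

X1 = rng.normal(size=(n1, m1))               # row-domain features
X2 = rng.normal(size=(n2, m2))               # column-domain features
Y = (rng.random((n1, n2)) < 0.05).astype(int)  # sparse binary interactions

print(X1.shape, X2.shape, Y.shape)  # (50, 8) (40, 5) (50, 40)
```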

Previous state-of-the-art biclustering forests such as BICTR scale as $\Theta(m n^2 \log n)$ and use constant-value leaf predictions, limiting both efficiency and expressivity. Oxytrees integrate proxy-based acceleration, efficient batch inference, and a model-tree structure with linear (kernelized) leaves to address these limitations (Ilídio et al., 16 Nov 2025).

2. Proxy-based Biclustering and Split Scoring

Oxytrees exploit the property that impurity measures on interaction submatrices can be reformulated in a proxy-friendly form:

$$I(Y_\text{node}) = \rho\!\left( \sum_{i,j} \mu(Y^{ij}_\text{node}) \right)$$

For variance impurity, $\mu(z) = (1, z, z^2)$, so the accumulated moment sums $(s_0, s_1, s_2)$ recover the node impurity via $\rho(s_0, s_1, s_2) = s_2/s_0 - (s_1/s_0)^2$. This enables computation of two proxy matrices per node:

  • Row proxy: $P_1^i = \sum_j \mu(Y^{ij}_\text{node})$, one moment vector per row
  • Column proxy: $P_2^j = \sum_i \mu(Y^{ij}_\text{node})$, one moment vector per column

Horizontal splits use prefix sums of the row proxy, while vertical splits use prefix sums of the column proxy, so that once rows or columns are sorted by a feature, every split candidate is scored in $O(1)$ per threshold. This reduces split scoring to $\Theta(m n \log n)$, a substantial theoretical and observed speedup relative to traditional biclustering trees (Ilídio et al., 16 Nov 2025).
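The proxy-plus-prefix-sum mechanism above can be sketched in NumPy. The sketch below (an illustration under the variance-impurity assumption, not the paper's implementation) accumulates row-wise moment sums $(s_0, s_1, s_2)$, scores every horizontal split threshold in $O(1)$ via prefix sums, and cross-checks the best score against a direct recomputation.

```python
import numpy as np

def variance_from_moments(s0, s1, s2):
    # rho: recovers a block's variance from its moment sums
    return s2 / s0 - (s1 / s0) ** 2

rng = np.random.default_rng(1)
n1, n2 = 30, 20
Y = rng.random((n1, n2))

# Row proxy: per-row sums of mu(z) = (1, z, z^2) over the columns
P = np.stack([np.full(n1, float(n2)), Y.sum(axis=1), (Y ** 2).sum(axis=1)], axis=1)

# Score a horizontal split along one (hypothetical) row feature
feat = rng.normal(size=n1)
order = np.argsort(feat)
prefix = np.cumsum(P[order], axis=0)   # cumulative moments of the "left" block
total = prefix[-1]

best = (np.inf, None)
for t in range(1, n1):                 # each threshold costs O(1)
    sL, sR = prefix[t - 1], total - prefix[t - 1]
    score = (sL[0] * variance_from_moments(*sL)
             + sR[0] * variance_from_moments(*sR)) / total[0]
    if score < best[0]:
        best = (score, t)

# Cross-check the winning threshold by recomputing variances directly
t = best[1]
left, right = Y[order[:t]], Y[order[t:]]
direct = (left.size * left.var() + right.size * right.var()) / Y.size
assert np.isclose(best[0], direct)
```

The same routine applies to vertical splits by swapping in the column proxy.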

3. Model Tree Construction and Leaf Modeling

Oxytrees construct a binary tree top-down. At each node, a random subset of candidate features is sampled from $X_1$ or $X_2$, akin to extremely randomized trees. Each candidate split is scored by its impurity reduction $\Delta I$, computed cheaply via the proxy mechanism. The split maximizing $\Delta I$ is chosen, and growth halts when the node's row or column count falls below a minimum or no split yields a positive $\Delta I$.

Distinctively, Oxytrees assign each leaf its local response submatrix $Y_\text{leaf}$ and fit a linear model parameterized by a Kronecker product kernel. This improves expressivity over mean-value leaves and yields shallower, more efficient trees (Ilídio et al., 16 Nov 2025).

4. Efficient Batch Leaf Assignment and Inference

For prediction over the Cartesian product of test sets $X_1^\text{test} \times X_2^\text{test}$, Oxytrees avoid redundant traversals. The inference procedure is:

  • Horizontal splits partition $X_1^\text{test}$, passing all of $X_2^\text{test}$ downstream.
  • Vertical splits partition $X_2^\text{test}$, passing all of $X_1^\text{test}$ downstream.
  • At each leaf, predictions for every (row, column) pair reaching it are computed in batch via matrix multiplication.

This routine visits each test instance only once per tree level, so routing scales with $n_1 + n_2$ per level rather than with the full product $n_1 n_2$ traversed by previous state-of-the-art methods (Ilídio et al., 16 Nov 2025).
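The routing scheme above can be sketched recursively. The node dictionary below is a hypothetical stand-in for the real tree structure (it is not the bipartite_learn API), with constant-value leaves for brevity; kernelized leaves would replace the scalar assignment with a batch matrix product.

```python
import numpy as np

def route(node, rows, cols, X1, X2, out):
    """Route row/column index sets down the tree; predict whole blocks at leaves."""
    if node["leaf"]:
        # One operation fills the entire block reaching this leaf
        out[np.ix_(rows, cols)] = node["value"]
        return
    feat, thr = node["feature"], node["threshold"]
    if node["axis"] == 0:  # horizontal split: partition rows, columns pass through
        mask = X1[rows, feat] <= thr
        route(node["left"], rows[mask], cols, X1, X2, out)
        route(node["right"], rows[~mask], cols, X1, X2, out)
    else:                  # vertical split: partition columns, rows pass through
        mask = X2[cols, feat] <= thr
        route(node["left"], rows, cols[mask], X1, X2, out)
        route(node["right"], rows, cols[~mask], X1, X2, out)

# Tiny hand-built tree: one horizontal split, then constant-value leaves
tree = {"leaf": False, "axis": 0, "feature": 0, "threshold": 0.0,
        "left": {"leaf": True, "value": 0.2},
        "right": {"leaf": True, "value": 0.8}}

X1 = np.array([[-1.0], [1.0], [-0.5]])
X2 = np.zeros((2, 1))
out = np.zeros((3, 2))
route(tree, np.arange(3), np.arange(2), X1, X2, out)
print(out)  # rows 0 and 2 go left (0.2), row 1 goes right (0.8)
```

Each test row and column index appears in exactly one branch per level, which is the single-visit property the section describes.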

5. Kronecker Product Kernel Models in Leaves

Each leaf uses regularized least-squares (RLS) regression with a Kronecker product kernel $K = K_1 \otimes K_2$. Given training Gram matrices $K_1$ and $K_2$ for the leaf's row and column instances, the eigendecompositions $K_1 = Q_1 \Lambda_1 Q_1^\top$ and $K_2 = Q_2 \Lambda_2 Q_2^\top$ produce the optimal dual coefficient matrix in closed form:

$$A = Q_1 \left[ (Q_1^\top Y_\text{leaf}\, Q_2) \oslash (\lambda_1 \lambda_2^\top + \lambda) \right] Q_2^\top$$

where $\oslash$ denotes elementwise division, $\lambda_1, \lambda_2$ are the eigenvalue vectors of $K_1, K_2$, and $\lambda$ is the regularization strength. Predictions for test batch similarity matrices $\tilde{K}_1$, $\tilde{K}_2$ are:

$$\hat{Y} = \tilde{K}_1 A\, \tilde{K}_2^\top$$

This structure allows efficient, batch-wise computation for any leaf-level domain sizes (Ilídio et al., 16 Nov 2025).
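The closed form above is the standard Kronecker-kernel RLS identity; the sketch below (symbol names are illustrative, not the paper's) computes the dual coefficients via the two small eigendecompositions and verifies them against the explicit, much larger Kronecker system.

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, lam = 6, 5, 0.1

def rbf_gram(X):
    # Simple RBF Gram matrix (illustrative choice of kernel)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2)

K1 = rbf_gram(rng.normal(size=(n1, 3)))   # row-domain Gram matrix
K2 = rbf_gram(rng.normal(size=(n2, 2)))   # column-domain Gram matrix
Y = rng.random((n1, n2))                  # leaf response block

# Eigendecompositions K1 = Q1 diag(l1) Q1^T, K2 = Q2 diag(l2) Q2^T
l1, Q1 = np.linalg.eigh(K1)
l2, Q2 = np.linalg.eigh(K2)

# Closed-form dual coefficients: vec(A) = (K1 (x) K2 + lam*I)^{-1} vec(Y),
# computed without ever forming the n1*n2 x n1*n2 Kronecker matrix
A = Q1 @ ((Q1.T @ Y @ Q2) / (np.outer(l1, l2) + lam)) @ Q2.T

# Verify against the explicit Kronecker system (feasible only at toy scale)
vecA = np.linalg.solve(np.kron(K1, K2) + lam * np.eye(n1 * n2), Y.ravel())
assert np.allclose(A.ravel(), vecA)

# Batch test predictions would then be: Y_hat = K1_test @ A @ K2_test.T
```

The eigendecomposition route costs $O(n_1^3 + n_2^3)$ per leaf instead of $O(n_1^3 n_2^3)$ for the naive solve, which is what makes per-leaf kernel models affordable.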

6. Empirical Results and Performance Benchmarks

Oxytrees were evaluated on 15 bipartite datasets, including drug–nuclear receptor, kinase inhibitor, and lncRNA–disease interactions, spanning a range of matrix sizes and densities of 1–20%. Baselines included MLPs on concatenated features, local models (RLS-avg, BLMNII, WkNNIR), RLS-Kron, NRLMF, and BICTR.

  • Validation encompassed instance-wise + dyad-wise splits yielding transductive (TD), semi-inductive (LT/TL), and fully inductive (TT) test sets; positive-unlabeled (PU) masking at 0–75%.
  • Metrics included AUROC, AUPRC, and Friedman + Nemenyi tests.

Key outcomes:

  • Predictive performance: Oxytrees match or outperform BICTR and RLS-Kron, notably in inductive scenarios.
  • Efficiency: Oxytrees train and predict substantially faster than BICTR, with statistically significant differences.
  • Model parsimony: Oxytrees reach 98% accuracy with substantially fewer trees due to their expressive leaves.
  • Proxy mechanism and leaf model ablation confirm that these components are crucial; removing leaf models degrades performance.
  • Robustness: Maintained accuracy under PU-masking and with increased minimum leaf size (Ilídio et al., 16 Nov 2025).

7. Software Implementation and Reproducibility

A Python package "bipartite_learn" provides the Oxytrees framework, including:

  • OxytreeClassifier class with fit(X1, X2, Y), predict(X1, X2), predict_proba(), and score()
  • hyperparameters for the number of trees, minimum leaf size, feature subsample size, impurity criterion, regularization strength ($\lambda$), and kernel functions $k_1$, $k_2$
  • efficient batch inference
  • access to all 15 preprocessed datasets and evaluation/PU-masking utilities

All results are reproducible via open-source code, with tutorials and Jupyter notebooks available at https://github.com/pedroilidio/oxytrees2025.

| Component | Characteristic | Complexity/Efficiency |
|---|---|---|
| Split scoring | Proxy row/column matrices | $\Theta(m n \log n)$ |
| Leaf model | Kronecker-kernel RLS | Efficient per-leaf SVD |
| Inference | Batch, single pass per test set | One visit per instance per tree level |

Oxytrees thereby constitute an interpretable, fast, and accurate methodology for inductive bipartite learning, combining proxy-based biclustering, expressive kernelized leaf modeling, and scalable batch inference (Ilídio et al., 16 Nov 2025).
