Oxytrees: Efficient Proxy-Based Biclustering Trees

Updated 23 November 2025
  • Oxytrees are proxy-based biclustering model trees that facilitate efficient bipartite learning on sparse interaction matrices.
  • They employ proxy-based acceleration to compress interaction matrices into row and column proxies, dramatically reducing computational complexity.
  • Their integration of kernelized leaf models using Kronecker product kernels yields expressive local predictions with faster training and inference.

Oxytrees are proxy-based biclustering model trees designed for efficient and interpretable bipartite learning—predicting binary interactions for pairs from two feature domains. They address scale, inductive generalization, and interpretability in settings where interactions are organized as sparse matrices, such as drug–target, RNA–disease, or regulatory networks. Oxytrees improve upon previous biclustering forests by compressing the interaction matrix into row and column proxies, enabling fast construction and inference with shallower trees whose leaves host expressive linear models parameterized by Kronecker product kernels (Ilídio et al., 16 Nov 2025).

1. Bipartite Learning Setting and Motivation

Bipartite learning seeks to infer a function $Y^{ij} = f(x_1^i, x_2^j) \in \{0,1\}$ given samples $X_1 = \{x_1^i\}_{i=1}^{n_1} \subseteq \mathbb{R}^{m_1}$ and $X_2 = \{x_2^j\}_{j=1}^{n_2} \subseteq \mathbb{R}^{m_2}$, forming a sparse bipartite interaction matrix $Y \in \{0,1\}^{n_1 \times n_2}$. Challenges arise from:

  • Scalability: $n_1$ and $n_2$ may both reach thousands.
  • Generalization: Models must predict for unseen $x_1, x_2$ instances (inductive setting).
  • Interpretability: The goal is a two-dimensional partitioning—biclustering—that remains comprehensible and locally predictive.
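To make the setting concrete, the sketch below builds a toy instance of the data layout described above (random data, illustrative sizes only): two feature matrices, one per domain, and a sparse binary interaction matrix over their Cartesian product.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, m1 = 50, 8   # e.g. 50 drugs with 8 features each (illustrative sizes)
n2, m2 = 40, 5   # e.g. 40 targets with 5 features each

X1 = rng.normal(size=(n1, m1))               # row-domain features
X2 = rng.normal(size=(n2, m2))               # column-domain features
Y = (rng.random((n1, n2)) < 0.05).astype(int)  # sparse binary interactions

print(X1.shape, X2.shape, Y.shape)  # (50, 8) (40, 5) (50, 40)
```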

Previous state-of-the-art biclustering forests such as BICTR scale as $\Theta(m n^2 \log n)$ and use constant-value leaf predictions, limiting both efficiency and expressivity. Oxytrees integrate proxy-based acceleration, efficient batch inference, and a model-tree structure with linear (kernelized) leaves to address these limitations (Ilídio et al., 16 Nov 2025).

2. Proxy-based Biclustering and Split Scoring

Oxytrees exploit the property that impurity measures on interaction submatrices can be reformulated in a proxy-friendly form:

$$I(Y_\text{node}) = \rho\!\left( \sum_{i,j} \mu(Y^{ij}_\text{node}) \right)$$

For variance impurity, $\mu(z) = (1, z, z^2)$, so the accumulated moment sums $(s_0, s_1, s_2)$ recover the node impurity via $\rho(s_0, s_1, s_2) = s_2/s_0 - (s_1/s_0)^2$. This enables computation of two proxy matrices per node:

  • Row proxy: $P_1^i = \sum_j \mu(Y^{ij}_\text{node})$, one moment vector per row
  • Column proxy: $P_2^j = \sum_i \mu(Y^{ij}_\text{node})$, one moment vector per column

Horizontal splits use prefix sums of the row proxy, while vertical splits use prefix sums of the column proxy, so that once rows or columns are sorted by a feature, every split candidate is scored in $O(1)$ per threshold. This reduces split scoring to $\Theta(m n \log n)$, a substantial theoretical and observed speedup relative to traditional biclustering trees (Ilídio et al., 16 Nov 2025).
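The proxy-plus-prefix-sum mechanism above can be sketched in NumPy. The sketch below (an illustration under the variance-impurity assumption, not the paper's implementation) accumulates row-wise moment sums $(s_0, s_1, s_2)$, scores every horizontal split threshold in $O(1)$ via prefix sums, and cross-checks the best score against a direct recomputation.

```python
import numpy as np

def variance_from_moments(s0, s1, s2):
    # rho: recovers a block's variance from its moment sums
    return s2 / s0 - (s1 / s0) ** 2

rng = np.random.default_rng(1)
n1, n2 = 30, 20
Y = rng.random((n1, n2))

# Row proxy: per-row sums of mu(z) = (1, z, z^2) over the columns
P = np.stack([np.full(n1, float(n2)), Y.sum(axis=1), (Y ** 2).sum(axis=1)], axis=1)

# Score a horizontal split along one (hypothetical) row feature
feat = rng.normal(size=n1)
order = np.argsort(feat)
prefix = np.cumsum(P[order], axis=0)   # cumulative moments of the "left" block
total = prefix[-1]

best = (np.inf, None)
for t in range(1, n1):                 # each threshold costs O(1)
    sL, sR = prefix[t - 1], total - prefix[t - 1]
    score = (sL[0] * variance_from_moments(*sL)
             + sR[0] * variance_from_moments(*sR)) / total[0]
    if score < best[0]:
        best = (score, t)

# Cross-check the winning threshold by recomputing variances directly
t = best[1]
left, right = Y[order[:t]], Y[order[t:]]
direct = (left.size * left.var() + right.size * right.var()) / Y.size
assert np.isclose(best[0], direct)
```

The same routine applies to vertical splits by swapping in the column proxy.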

3. Model Tree Construction and Leaf Modeling

Oxytrees construct a binary tree top-down. At each node, a random subset of candidate features is sampled from $X_1$ or $X_2$, akin to extremely randomized trees. Each candidate split is scored by its impurity reduction $\Delta I$, computed cheaply via the proxy mechanism. The split maximizing $\Delta I$ is chosen, and growth halts when the node's row or column count falls below a minimum or no split yields a positive $\Delta I$.

Distinctively, Oxytrees assign each leaf its local response submatrix $Y_\text{leaf}$ and fit a linear model parameterized by a Kronecker product kernel. This improves expressivity over mean-value leaves and yields shallower, more efficient trees (Ilídio et al., 16 Nov 2025).

4. Efficient Batch Leaf Assignment and Inference

For prediction over the Cartesian product of test sets $X_1^\text{test} \times X_2^\text{test}$, Oxytrees avoid redundant traversals. The inference procedure is:

  • Horizontal splits partition $X_1^\text{test}$, passing all of $X_2^\text{test}$ downstream.
  • Vertical splits partition $X_2^\text{test}$, passing all of $X_1^\text{test}$ downstream.
  • At each leaf, predictions for every (row, column) pair reaching it are computed in batch via matrix multiplication.

This routine visits each test instance only once per tree level, so routing scales with $n_1 + n_2$ per level rather than with the full product $n_1 n_2$ traversed by previous state-of-the-art methods (Ilídio et al., 16 Nov 2025).
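The routing scheme above can be sketched recursively. The node dictionary below is a hypothetical stand-in for the real tree structure (it is not the bipartite_learn API), with constant-value leaves for brevity; kernelized leaves would replace the scalar assignment with a batch matrix product.

```python
import numpy as np

def route(node, rows, cols, X1, X2, out):
    """Route row/column index sets down the tree; predict whole blocks at leaves."""
    if node["leaf"]:
        # One operation fills the entire block reaching this leaf
        out[np.ix_(rows, cols)] = node["value"]
        return
    feat, thr = node["feature"], node["threshold"]
    if node["axis"] == 0:  # horizontal split: partition rows, columns pass through
        mask = X1[rows, feat] <= thr
        route(node["left"], rows[mask], cols, X1, X2, out)
        route(node["right"], rows[~mask], cols, X1, X2, out)
    else:                  # vertical split: partition columns, rows pass through
        mask = X2[cols, feat] <= thr
        route(node["left"], rows, cols[mask], X1, X2, out)
        route(node["right"], rows, cols[~mask], X1, X2, out)

# Tiny hand-built tree: one horizontal split, then constant-value leaves
tree = {"leaf": False, "axis": 0, "feature": 0, "threshold": 0.0,
        "left": {"leaf": True, "value": 0.2},
        "right": {"leaf": True, "value": 0.8}}

X1 = np.array([[-1.0], [1.0], [-0.5]])
X2 = np.zeros((2, 1))
out = np.zeros((3, 2))
route(tree, np.arange(3), np.arange(2), X1, X2, out)
print(out)  # rows 0 and 2 go left (0.2), row 1 goes right (0.8)
```

Each test row and column index appears in exactly one branch per level, which is the single-visit property the section describes.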

5. Kronecker Product Kernel Models in Leaves

Each leaf uses regularized least-squares (RLS) regression with a Kronecker product kernel $K = K_1 \otimes K_2$. Given training Gram matrices $K_1$ and $K_2$ for the leaf's row and column instances, the eigendecompositions $K_1 = Q_1 \Lambda_1 Q_1^\top$ and $K_2 = Q_2 \Lambda_2 Q_2^\top$ produce the optimal dual coefficient matrix in closed form:

$$A = Q_1 \left[ (Q_1^\top Y_\text{leaf}\, Q_2) \oslash (\lambda_1 \lambda_2^\top + \lambda) \right] Q_2^\top$$

where $\oslash$ denotes elementwise division, $\lambda_1, \lambda_2$ are the eigenvalue vectors of $K_1, K_2$, and $\lambda$ is the regularization strength. Predictions for test batch similarity matrices $\tilde{K}_1$, $\tilde{K}_2$ are:

$$\hat{Y} = \tilde{K}_1 A\, \tilde{K}_2^\top$$

This structure allows efficient, batch-wise computation for any leaf-level domain sizes (Ilídio et al., 16 Nov 2025).
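The closed form above is the standard Kronecker-kernel RLS identity; the sketch below (symbol names are illustrative, not the paper's) computes the dual coefficients via the two small eigendecompositions and verifies them against the explicit, much larger Kronecker system.

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, lam = 6, 5, 0.1

def rbf_gram(X):
    # Simple RBF Gram matrix (illustrative choice of kernel)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2)

K1 = rbf_gram(rng.normal(size=(n1, 3)))   # row-domain Gram matrix
K2 = rbf_gram(rng.normal(size=(n2, 2)))   # column-domain Gram matrix
Y = rng.random((n1, n2))                  # leaf response block

# Eigendecompositions K1 = Q1 diag(l1) Q1^T, K2 = Q2 diag(l2) Q2^T
l1, Q1 = np.linalg.eigh(K1)
l2, Q2 = np.linalg.eigh(K2)

# Closed-form dual coefficients: vec(A) = (K1 (x) K2 + lam*I)^{-1} vec(Y),
# computed without ever forming the n1*n2 x n1*n2 Kronecker matrix
A = Q1 @ ((Q1.T @ Y @ Q2) / (np.outer(l1, l2) + lam)) @ Q2.T

# Verify against the explicit Kronecker system (feasible only at toy scale)
vecA = np.linalg.solve(np.kron(K1, K2) + lam * np.eye(n1 * n2), Y.ravel())
assert np.allclose(A.ravel(), vecA)

# Batch test predictions would then be: Y_hat = K1_test @ A @ K2_test.T
```

The eigendecomposition route costs $O(n_1^3 + n_2^3)$ per leaf instead of $O(n_1^3 n_2^3)$ for the naive solve, which is what makes per-leaf kernel models affordable.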

6. Empirical Results and Performance Benchmarks

Oxytrees were evaluated on 15 bipartite datasets, including drug–nuclear receptor, kinase inhibitor, and lncRNA–disease interactions, spanning a range of matrix sizes and densities of 1–20%. Baselines included MLPs on concatenated features, local models (RLS-avg, BLMNII, WkNNIR), RLS-Kron, NRLMF, and BICTR.

  • Validation encompassed instance-wise + dyad-wise splits yielding transductive (TD), semi-inductive (LT/TL), and fully inductive (TT) test sets; positive-unlabeled (PU) masking at 0–75%.
  • Metrics included AUROC, AUPRC, and Friedman + Nemenyi tests.

Key outcomes:

  • Predictive performance: Oxytrees match or outperform BICTR and RLS-Kron, notably in inductive scenarios.
  • Efficiency: Oxytrees train and predict substantially faster than BICTR, with statistically significant differences.
  • Model parsimony: Oxytrees reach 98% accuracy with substantially fewer trees due to their expressive leaves.
  • Proxy mechanism and leaf model ablation confirm that these components are crucial; removing leaf models degrades performance.
  • Robustness: Maintained accuracy under PU-masking and with increased minimum leaf size (Ilídio et al., 16 Nov 2025).

7. Software Implementation and Reproducibility

A Python package "bipartite_learn" provides the Oxytrees framework, including:

  • OxytreeClassifier class with fit(X1, X2, Y), predict(X1, X2), predict_proba(), and score()
  • hyperparameters for the number of trees, minimum leaf size, feature subsample size, impurity criterion, regularization strength ($\lambda$), and kernel functions $k_1$, $k_2$
  • efficient batch inference
  • access to all 15 preprocessed datasets and evaluation/PU-masking utilities

All results are reproducible via open-source code, with tutorials and Jupyter notebooks available at https://github.com/pedroilidio/oxytrees2025.

| Component | Characteristic | Complexity/Efficiency |
|---|---|---|
| Split scoring | Proxy row/column matrices | $\Theta(m n \log n)$ |
| Leaf model | Kronecker-kernel RLS | Efficient per-leaf SVD |
| Inference | Batch, single pass per test set | One visit per instance per tree level |

Oxytrees thereby constitute an interpretable, fast, and accurate methodology for inductive bipartite learning, combining proxy-based biclustering, expressive kernelized leaf modeling, and scalable batch inference (Ilídio et al., 16 Nov 2025).
