
Decision Tree Protocols

Updated 7 January 2026
  • Decision tree protocols are formally structured rules that leverage branching tree models for efficient decision-making and prediction.
  • They combine classical impurity-based methods with statistical tests and cryptographic techniques like MPC and FHE to enhance privacy and performance.
  • Applications range from secure machine learning and distributed inference to communication protocols and domain-specific decision support systems.

A decision tree protocol is a formally specified set of rules, data structures, or cryptographic procedures for decision-making or information processing that leverages the branching structure of a decision tree. The term spans classical algorithms for data partitioning, protocols for distributed or privacy-preserving computation, multi-user coordination over communication channels, and frameworks that encode domain knowledge or microeconomic preferences into tree-structured decision rules. In addition to classical top-down learning methods (e.g., ID3, CART), decision tree protocols encompass secure evaluation protocols (PDTE), MPC and FHE-based training/evaluation, advanced split criteria (e.g., statistical subgroup testing), and coding/communication protocols for distributed or adversarial environments.

1. Formal Definitions and Structural Properties

A decision tree protocol consists of a rooted, directed tree $T=(V,E)$ in which each internal node $v\in V$ is associated with a test function $f_v:\mathcal{X}\to\{\text{yes},\text{no}\}$ (for binary trees), and each leaf $\ell$ carries a terminal action, prediction, or value. Traversal operates as follows: given $x\in\mathcal{X}$, traverse the tree from the root, routing at each node $v$ on the outcome of $f_v(x)$, until a leaf is reached.

The formalism supports both axis-aligned (“if $x_k\leq t$”) and more general conjunctive/disjunctive branching (see (Brathwaite et al., 2017)): a decision protocol may be a “disjunction-of-conjunctions” of primitive feature predicates, characterizing non-compensatory rationality models.
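The traversal rule above can be sketched as a minimal data structure. This is a hypothetical illustration: the node tests, feature names, and leaf labels are invented for the example, not taken from any cited protocol.

```python
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class Leaf:
    value: object  # terminal action, prediction, or value

@dataclass
class Node:
    test: Callable[[dict], bool]   # f_v : X -> {yes, no}
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]

def evaluate(tree, x):
    """Route x from the root, branching on f_v(x), until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.test(x) else tree.no
    return tree.value

# Both an axis-aligned test ("if x_k <= t") and a conjunctive predicate:
t1 = Node(lambda x: x["age"] <= 30,
          yes=Leaf("A"),
          no=Node(lambda x: x["income"] > 50 and x["urban"],
                  Leaf("B"), Leaf("C")))
```

The same `Node`/`Leaf` skeleton accommodates the disjunction-of-conjunctions view: each test function may itself be an arbitrary Boolean combination of primitive feature predicates.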

2. Learning, Splitting, and Stopping Criteria

Classical and Secure Training

Protocols for tree construction may be evaluative/impurity-driven or statistically principled:

  • Classical (impurity-based): At each node, scan candidate partitions and select the split maximizing impurity reduction (e.g., Gini, entropy) (Hamada et al., 2021). In privacy-preserving MPC, grouping and sorting can be implemented over secret-shared data, reducing round complexity (Hamada et al., 2021).
  • Statistical subgroup testing: ZTree (Cheng et al., 16 Sep 2025) replaces impurity with hypothesis tests (e.g., $z$-test, $t$-test, Mann–Whitney $U$), selecting splits whose cross-validated test statistic (corrected for multiple testing by internal $K$-fold cross-validation) exceeds a user-controlled threshold $t$. Explicit formulas for $z$-scores and cross-validated estimation are specified; post-pruning is obviated and tree complexity is controlled solely by $t$.
  • Expert-driven: In medical care protocols (Sinha et al., 2018), trees are pre-built from guidelines with no algorithmic split criterion; updates are applied manually via expert review.
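The classical impurity-driven scan in the first bullet can be sketched in plain Python. This is a simplified single-node sketch over cleartext data, not the secret-shared MPC variant from Hamada et al.; function names are illustrative.

```python
def gini(labels):
    """Gini impurity of a label multiset: 1 - sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_axis_aligned_split(X, y):
    """Scan all (feature, threshold) candidates; return the split
    maximizing impurity reduction, as (feature, threshold, gain)."""
    n, d = len(X), len(X[0])
    parent = gini(y)
    best = (None, None, 0.0)
    for k in range(d):
        for t in sorted({row[k] for row in X}):
            left = [y[i] for i in range(n) if X[i][k] <= t]
            right = [y[i] for i in range(n) if X[i][k] > t]
            if not left or not right:
                continue  # degenerate split: all rows on one side
            gain = parent - (len(left) / n) * gini(left) \
                          - (len(right) / n) * gini(right)
            if gain > best[2]:
                best = (k, t, gain)
    return best
```

In the MPC setting, the same scan is expressed over secret-shared data: the per-group counts and prefix sums that drive `gini` are computed with secure grouped aggregation rather than in the clear.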

Stopping, Pruning, and Model Selection

Traditionally, post-pruning is employed to control overfitting; in ZTree (Cheng et al., 16 Sep 2025), internal CV-based multiplicity correction renders post-pruning unnecessary, and the parameter $t$ can be tuned post hoc to control tree depth and complexity, with all simpler trees obtainable by node thresholding.
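A simplified illustration of the statistical-test split rule: a plain two-sample $z$-test on one feature, with a split accepted only if the best $|z|$ exceeds a threshold. ZTree's internal cross-validation correction for multiple testing is deliberately omitted here; the function names and data are hypothetical.

```python
from math import sqrt

def z_score(left, right):
    """Two-sample z-statistic for a difference in means (large-sample form)."""
    nL, nR = len(left), len(right)
    mL, mR = sum(left) / nL, sum(right) / nR
    vL = sum((v - mL) ** 2 for v in left) / nL
    vR = sum((v - mR) ** 2 for v in right) / nR
    return (mL - mR) / max(sqrt(vL / nL + vR / nR), 1e-12)

def split_if_significant(x, y, threshold_t=1.96):
    """Pick the cut on a single feature with the largest |z|;
    split only if it exceeds the user-controlled threshold t."""
    best = (None, 0.0)
    for cut in sorted(set(x))[:-1]:
        left = [y[i] for i in range(len(x)) if x[i] <= cut]
        right = [y[i] for i in range(len(x)) if x[i] > cut]
        z = abs(z_score(left, right))
        if z > best[1]:
            best = (cut, z)
    return best if best[1] > threshold_t else (None, best[1])
```

Because complexity is controlled solely by the threshold, lowering `threshold_t` post hoc grows the tree and raising it shrinks the tree, without a separate pruning pass.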

| Protocol | Split Criterion | Stopping/Complexity Control |
|---|---|---|
| CART (Hamada et al., 2021) | Gini/entropy impurity | min samples / complexity pruning |
| ZTree (Cheng et al., 16 Sep 2025) | $z$-statistic (statistical test) | CV $z$-threshold, monotonicity |
| Care protocol (Sinha et al., 2018) | Human guidelines | Manual/expert review |

3. Decision Tree Protocols in Distributed, Secure, and Adversarial Computation

A major area of innovation involves protocols for secure, privacy-preserving, or communication-efficient decision tree evaluation and training. These include:

  • MPC-based training (Hamada et al., 2021): Datasets are secret-shared among $n$ parties, and aggregates such as group sums, maxima, and prefix-sums are computed securely to find optimal splits. The protocol improves from $O(2^h m n\log n)$ to $O(h m n\log n)$ comparisons (for tree height $h$) by leveraging secure grouped data structures, avoiding the need to inject dummy rows.
  • Private Decision Tree Evaluation (PDTE): Protocols using FHE, HE, OT, or RSS realize a server-client model where model parameters/structure and user feature vectors remain private. For example:
    • Level Up (Mahdavi et al., 2023): Leveled HE with non-interactive XXCMP/RCC comparison gadgets; subtree traversal via SumPath yields efficient protocols for trees with $>1000$ nodes, supporting arbitrary precision.
    • Kangaroo (Xu et al., 3 Sep 2025): High-throughput, constant-round, amortized protocols using BFV homomorphic encryption, SIMD ciphertext packing, and batch feature/path evaluation, offering $\sim 60$ ms per tree (amortized) for forests with $>400{,}000$ nodes in WAN conditions.
    • Sublinear PDTE (Bai et al., 2022, Bai et al., 2023): Protocols achieve $O(d)$ (rather than $O(2^d)$ or $O(m)$) communication via oblivious selection (SOS, OS) over tree arrays, leveraging cryptographic primitives (PRF/AHE, DPF+RSS) and secret sharing. Malicious security in honest-majority 3PC is attained (Bai et al., 2023).
    • Level-site partitioning (Quijano et al., 4 May 2025): For secure outsourced inference on deep or sparse trees, each level is processed by a distinct “level-site” entity. Protocol mitigates timing/side-channel leakage, improves runtime, and supports multi-cloud deployment.
    • Non-interactive (single-round) protocols (Tueno et al., 2019, Mahdavi et al., 2023): FHE-based non-interactive evaluation with branching executed algebraically; SumPath and ciphertext packing are standard patterns.
| Protocol | Model Privacy | Input Privacy | Communication | Security Model |
|---|---|---|---|---|
| Level Up (Mahdavi et al., 2023) | Yes | Yes | Sublinear | Semi-honest, FHE |
| Kangaroo (Xu et al., 3 Sep 2025) | Yes | Yes | Constant-round, amortized | Semi-honest, HE |
| Mostree (Bai et al., 2023) | Yes | Yes | Sublinear | Malicious, 3PC |
| Level-site (Quijano et al., 4 May 2025) | Yes | Yes | $O(d)$ | HBC, no collusion |
| Sublinear PDTE (Bai et al., 2022) | Yes | Yes | $O(d)$ | Semi-honest, 2PC |
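The SumPath pattern mentioned above can be illustrated in the clear. This is a plaintext sketch only: in the actual FHE-based protocols each path cost is accumulated homomorphically and the client learns nothing but the result. The tree encoding (nested dicts) is an assumption made for the example.

```python
# Plaintext illustration of the SumPath pattern used in FHE-based PDTE:
# each root-to-leaf path accumulates one "mismatch" term per edge; exactly
# one leaf ends with path cost 0, and its label is the evaluation result.

def sum_path_eval(tree, x):
    """tree: nested dict {'feat','thr','yes','no'} or {'leaf': label}."""
    results = []

    def walk(node, cost):
        if "leaf" in node:
            results.append((cost, node["leaf"]))
            return
        b = 1 if x[node["feat"]] <= node["thr"] else 0  # comparison bit
        walk(node["yes"], cost + (1 - b))  # penalty if comparison said "no"
        walk(node["no"], cost + b)         # penalty if comparison said "yes"

    walk(tree, 0)
    # In a real protocol each cost is computed under encryption, randomized,
    # and only the label whose cost decrypts to 0 is recovered by the client.
    return next(label for cost, label in results if cost == 0)
```

Note that every leaf is touched regardless of the input, which is exactly why the encoding composes with encrypted comparison gadgets: no branch of the computation depends on a plaintext decision.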

A plausible implication is that the trend in PDTE protocol design is toward sublinear, constant-round, and batch-amortized computation, with increasing practical efficiency and modifiable trust assumptions.

4. Decision-Tree-Based Protocols in Communication and Multiple-Access

Decision tree protocols are widely used in communication-theoretic and multi-user systems:

  • Splitting tree protocols (Sørensen et al., 2013): In classical tree-based multi-access schemes, collision events in slotted channels drive node splitting, yielding a decision tree structure. Coded Splitting Tree Protocols (CSTP) extend this by running $K$ partial trees in parallel, mapping unresolved collisions to a bipartite Tanner graph analyzed by belief propagation/SIC. The protocol's core decision is the order of splitting, optimized by shaping the leaf-degree distribution for favorable SIC.
  • Throughput and complexity: CSTP achieves up to $\sim 0.80$ throughput (vs. $0.69$ for ideal SIC-enhanced classical splitting); this comes at higher feedback overhead and receiver complexity, but the protocol generalizes tree splitting into a code structure for more efficient information resolution.
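The basic (non-coded, non-SIC) splitting mechanism can be simulated in a few lines: colliding users flip fair coins to split into subgroups, recursively, until every slot holds at most one user. This toy model is an assumption-laden sketch of the classical binary splitting algorithm, not of CSTP; for two colliding users the expected resolution interval is 5 slots, i.e., throughput 0.4.

```python
import random

def resolve_collision(n_users, rng):
    """Slots needed to resolve n colliding users with fair binary splitting.
    A slot with 0 or 1 users (idle or success) costs 1 slot and terminates."""
    slots = 1  # the slot in which this group transmits
    if n_users <= 1:
        return slots
    # Collision: each user flips a fair coin to join the left subgroup.
    left = sum(rng.random() < 0.5 for _ in range(n_users))
    return slots + resolve_collision(left, rng) + \
        resolve_collision(n_users - left, rng)

rng = random.Random(0)
trials = [resolve_collision(2, rng) for _ in range(10000)]
avg = sum(trials) / len(trials)  # empirically close to 5 slots
```

CSTP's gain comes precisely from replacing this fully resolved recursion with $K$ partial trees whose unresolved collisions are handed to a belief-propagation/SIC decoder.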

5. Decision Tree Protocols in Descriptive and Prescriptive Decision Theories

Decision tree protocols serve as formalizations of behavioral or microeconomic choice. In (Brathwaite et al., 2017):

  • Disjunctions-of-Conjunctions Protocol: Decision trees encode rules of the form “if any of $D$ conjunctive criteria is satisfied, consider the alternative.” Mathematically, $P(\text{consider}\mid x) = I\!\left[\sum_{i=1}^{D} \prod_{k} I\{x_k \in S_{ik}\} \ge 1\right]$.
  • Semi-compensatory models: Two-stage frameworks employ trees for non-compensatory screening (consideration set formation), then a compensatory model (e.g., multinomial logit) over the surviving alternatives.
  • Bayesian model trees: Trees are endowed with priors over structure and parameters, enabling quantification of heterogeneity and model uncertainty. This protocol finds practical and theoretical application in discrete choice modeling, e.g., predicting sharp behavioral thresholds in transport mode choice (Brathwaite et al., 2017).
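The disjunction-of-conjunctions screening rule is a one-liner in code. The transport-style criteria below are hypothetical, chosen only to mirror the consideration-set example; the feature names and thresholds are not from the cited work.

```python
def consider(x, criteria):
    """Disjunction-of-conjunctions screening: consider x iff at least one
    conjunctive criterion is fully satisfied.
    criteria: list of dicts mapping feature name -> predicate on that feature."""
    return any(all(test(x[k]) for k, test in crit.items()) for crit in criteria)

# Hypothetical screening: an alternative is considered if it is
# either cheap AND fast, or within walking distance.
criteria = [
    {"cost": lambda c: c <= 3.0, "time": lambda t: t <= 30},
    {"distance": lambda d: d <= 1.0},
]
```

In the two-stage semi-compensatory framework, `consider` would form the consideration set, after which a compensatory model (e.g., multinomial logit) scores the surviving alternatives.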

This line of work establishes decision trees as expressive formal decision protocols that subsume classical rules, providing interpretability and statistical rigor.

6. Communication Protocols, Compression, and Function Complexity

Decision tree protocols play a key role in information theory and computational complexity:

  • Fourier Entropy-Influence Conjecture (Wan et al., 2013): There exists a communication protocol (based on decision tree structure) for sampling from the spectral distribution of a Boolean function $f$, with average description length bounded by $C \cdot \operatorname{Inf}[f]$. For read-$k$ trees, $H(\widehat{f}^2) \leq 9k \operatorname{Inf}[f]$; for expected-depth-$d$ trees, $H(\widehat{f}^2) \leq 12d \operatorname{Inf}[f]$.
  • Encoding protocols: Recursive decision tree–based prefix-free codes achieve the above bounds, and improvements in these tree-based protocols would yield progress on the FEI conjecture.
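The quantities $H(\widehat{f}^2)$ and $\operatorname{Inf}[f]$ can be computed exactly for small $n$ by brute-force Fourier expansion, which makes the bounds concrete. This is a generic textbook computation (assumed, not from the cited paper); e.g., for 3-bit majority the spectral entropy is exactly $2$ and the total influence is $1.5$.

```python
from itertools import product
from math import log2

def fourier_coefficients(f, n):
    """Exact Fourier expansion of f: {0,1}^n -> {-1,+1} in the parity basis:
    f_hat(S) = E_x[ f(x) * (-1)^{sum_{i in S} x_i} ]."""
    coeffs = {}
    for S in product([0, 1], repeat=n):
        total = 0.0
        for x in product([0, 1], repeat=n):
            chi = (-1) ** sum(s * xi for s, xi in zip(S, x))  # parity character
            total += f(x) * chi
        coeffs[S] = total / 2 ** n
    return coeffs

def spectral_entropy_and_influence(f, n):
    """H(f_hat^2): Shannon entropy of the spectral distribution {f_hat(S)^2}
    (a distribution by Parseval, since f is ±1-valued);
    Inf[f]: total influence, sum over S of |S| * f_hat(S)^2."""
    coeffs = fourier_coefficients(f, n)
    H = -sum(c ** 2 * log2(c ** 2) for c in coeffs.values() if c != 0.0)
    Inf = sum(sum(S) * c ** 2 for S, c in coeffs.items())
    return H, Inf
```

Checking $H(\widehat{f}^2)$ against $\operatorname{Inf}[f]$ on such small examples is exactly what the FEI conjecture asserts can be done with a universal constant $C$ for all $f$.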

The communication-protocol view provides a unified language connecting decision tree complexity, information transmission, and Boolean function analysis.

7. Decision Tree Protocols in Domain-Specific Applications

Domain-directed tree protocols serve as adaptive, interactive mechanisms in decision support systems:

  • Contextual care protocols (Sinha et al., 2018): A neural network first thresholds probability of diseases, then for each the corresponding human-curated decision tree is traversed for recommendations. Trees are updated via physician corrections, with clustering/grouping of feedback for scalable expert review.
  • Domain-directed dialogs ([0703072]): Details are not reported in the provided data, but the protocol is stated to generate dialog workflows automatically via decision trees, augmented by user feedback to optimize dialog path cost.

A plausible implication is that in complex, evolving domains, decision tree protocols support both automated and expert-overridable operational pipelines.


References: Selected cited works—(Sinha et al., 2018, Sørensen et al., 2013, Brathwaite et al., 2017, Wan et al., 2013, Hamada et al., 2021, Mahdavi et al., 2023, Xu et al., 3 Sep 2025, Tueno et al., 2019, Bai et al., 2023, Cheng et al., 16 Sep 2025, Quijano et al., 4 May 2025, Bai et al., 2022).

Summary: Decision tree protocols encapsulate a spectrum of formal mechanisms for decision-making, communication, privacy preservation, and behavioral description. They are central to current research in secure and distributed machine learning, communication systems, and microeconomic modeling; ongoing work focuses on enriching their statistical rigor, interpretability, and efficiency under stringent privacy and communication constraints.
