FOLD-TR: Explainable Inductive Rank-Learner
- The paper presents FOLD-TR’s integration of pairwise ranking with explainable ILP using ASP proof trees to generate stratified logic programs.
- It encodes numerical differences and categorical matches from rank-adjacent samples, applying information gain to induce precise ranking rules.
- FOLD-TR delivers native explainability by providing detailed proof trees that transparently justify each ranking decision.
The Logic Program Inductive Rank-Learner (FOLD-TR) is a scalable and efficient inductive logic programming (ILP) algorithm for learning to rank, extending FOLD-R++ with a framework for explainable pairwise ranking over mixed-type (numerical and categorical) features. FOLD-TR generates a stratified normal logic program that not only predicts the order of new items according to patterns in the training data, but also provides native, human-interpretable explanations for its ranking decisions by leveraging answer set programming (ASP) proof trees (Wang et al., 2022).
1. Input Representation and Pairwise Encoding
FOLD-TR operates on ranking data given as a ground-truth partial or total order over a set of items, each described by a fixed set of numerical or categorical features. The ranking is transformed into a set of pairwise examples:
- Positive pairs: each ordered pair $(A, B)$ in which $A$ should rank above $B$ is included.
- Negative pairs: optionally, reversed pairs such as $(B, A)$, or other pairs, are sampled as negatives.
For each training pair $(A, B)$, FOLD-TR uses a “plot” procedure to encode the pair into a feature vector of ground atoms suitable for logic program induction:
- For a numeric feature $f$, the difference $f(A) - f(B)$ is recorded; induction later compares it against a learned threshold.
- For a categorical feature $g$, two atoms are recorded, one for the value of $g$ on $A$ and one for its value on $B$.
These literals form the candidate atoms from which FOLD-TR constructs normal logic program clauses, all with the head better(A,B).
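The pair encoding can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `plot_pair`, the dictionary layout, and the `chas` values are assumptions made for the example, while the `rm` and `crim` values are those quoted in the justification example in Section 6.

```python
from typing import Dict, Set, Union

Value = Union[int, float, str]


def plot_pair(a: Dict[str, Value], b: Dict[str, Value],
              numeric: Set[str]) -> Dict[str, object]:
    """Encode the ordered pair (a, b) for rule induction (hypothetical helper).

    Numeric features become the difference a[f] - b[f];
    categorical features keep both values as a (value_a, value_b) tuple.
    """
    encoded: Dict[str, object] = {}
    for name in a:
        if name in numeric:
            encoded[name] = a[name] - b[name]
        else:
            encoded[name] = (a[name], b[name])
    return encoded


# rm and crim values come from the article's example; chas values are made up.
item_a = {"rm": 6.575, "crim": 0.00632, "chas": "0"}
item_b = {"rm": 5.887, "crim": 13.3598, "chas": "1"}

pair = plot_pair(item_a, item_b, numeric={"rm", "crim"})
print(pair)
```

A positive example would pair this encoding with the label "A ranks above B"; swapping the arguments gives the encoding of the corresponding negative pair.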
2. Ranking Objective, Loss Function, and Information Gain
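The algorithm adopts a pairwise approach and seeks to minimize the 0–1 pairwise classification error over all generated pairs: $\sum_{(A,B) \in E} \mathbb{1}[\hat{y}(A,B) \neq y(A,B)]$, where $E$ is the set of generated pairs, $y(A,B)$ is the label that distinguishes positive from negative pairs, and $\hat{y}(A,B)$ is the prediction of the learned program.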
In practical induction, FOLD-TR uses information gain (IG) as the primary split selection heuristic: $IG(S, \ell) = H(S) - \frac{|S_1|}{|S|} H(S_1) - \frac{|S_2|}{|S|} H(S_2)$, where $H$ denotes binary entropy, and $S_1$, $S_2$ are the splits of the data $S$ induced by the candidate literal $\ell$.
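As a rough illustration of the heuristic, the sketch below computes the textbook entropy-based information gain of a candidate literal over labeled pairs. The function names and the boolean-list representation are assumptions for the example; the FOLD-R++ code base may use a differently scaled but related score.

```python
import math
from typing import Sequence


def binary_entropy(pos: int, neg: int) -> float:
    """Entropy (in bits) of a set with `pos` positive and `neg` negative examples."""
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / (pos + neg)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(labels: Sequence[bool], covered: Sequence[bool]) -> float:
    """Gain of splitting `labels` by whether the candidate literal holds (`covered`)."""
    n = len(labels)
    pos_in = sum(1 for y, c in zip(labels, covered) if c and y)
    neg_in = sum(1 for y, c in zip(labels, covered) if c and not y)
    pos_out = sum(1 for y, c in zip(labels, covered) if not c and y)
    neg_out = sum(1 for y, c in zip(labels, covered) if not c and not y)
    base = binary_entropy(pos_in + pos_out, neg_in + neg_out)
    inside = (pos_in + neg_in) / n * binary_entropy(pos_in, neg_in)
    outside = (pos_out + neg_out) / n * binary_entropy(pos_out, neg_out)
    return base - inside - outside


labels = [True, True, False, False]
perfect = information_gain(labels, [True, True, False, False])
useless = information_gain(labels, [True, False, True, False])
print(perfect, useless)
```

A literal that separates positives from negatives exactly recovers the full entropy of the set, while a literal that leaves both sides evenly mixed gains nothing.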
3. Algorithm Overview and Distinctions from FOLD-R++
FOLD-TR encapsulates a customized FOLD-R++ within a pairwise training and sampling framework. Key steps include:
- Rank-adjacent sampling: Given a ranked list, FOLD-TR samples a limited number of items from the training set, using a Normal distribution to prefer adjacent ranks. This avoids the quadratic $O(n^2)$ expansion of all possible pairs over $n$ items.
- Pairwise data plotting: The sampled items are expanded to positive and negative pairwise examples using the “plot” step outlined above.
- Rule induction (fold_rpp): The core ILP learner repeatedly induces logic program clauses that maximize information gain over the current set of training pairs, using a greedy, clause-wise approach. Each iteration adds a pair of literals (one numerical-difference, one categorical-equality) to specialize the rule body, halting when information gain drops to zero or the negative-to-positive ratio falls below a threshold.
- Exception handling and recursion: If a rule overgeneralizes, an “abnormal” clause (exception) is induced for the uncovered negative pairs, forming a stratified, default-plus-exception structure.
- Termination: Induction terminates when all positive pairs are covered or progress stalls.
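The rank-adjacent sampling step can be sketched as follows. This is one possible reading under stated assumptions: for every item, `k` partners are drawn at a Normal-distributed rank offset with standard deviation `sigma`. The function name, both parameters, and the per-item sampling scheme are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def sample_rank_adjacent_pairs(ranked: Sequence[str], k: int, sigma: float = 2.0,
                               seed: int = 0) -> Tuple[List[tuple], List[tuple]]:
    """Sample pairs whose rank distance follows a rounded Normal(0, sigma).

    `ranked` lists items best first. Returns (positives, negatives), where each
    positive is (higher_ranked, lower_ranked) and each negative is its reverse.
    """
    rng = random.Random(seed)
    n = len(ranked)
    index_pairs = set()
    for i in range(n):
        for _ in range(k):
            offset = round(rng.gauss(0.0, sigma))
            j = i + offset
            if offset == 0 or j < 0 or j >= n:
                continue  # skip self-pairs and out-of-range partners
            index_pairs.add((min(i, j), max(i, j)))
    positives = [(ranked[hi], ranked[lo]) for hi, lo in sorted(index_pairs)]
    negatives = [(b, a) for a, b in positives]
    return positives, negatives


ranked_items = list("abcdefghij")  # "a" is ranked best
positives, negatives = sample_rank_adjacent_pairs(ranked_items, k=3)
print(len(positives), "positive pairs out of", 10 * 9 // 2, "possible")
```

Because at most `n * k` draws are made, the number of pairs grows linearly with the number of items for a fixed `k`, and small rank distances dominate the sample.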
Distinctive features compared to FOLD-R++ include:
- Literal selection operates on candidate pairs (numeric-difference plus categorical-equality) to structure better(A,B) hypotheses.
- Subsampling of pairs via Normal sampling over adjacent ranks, avoiding the full $O(n^2)$ pairwise expansion.
- The “plot” mapping, converting ranking pairs to logic-based feature differences and equalities.
4. Learned Program Structure and Default–Exception Clauses
The learned output of FOLD-TR is a stratified normal logic program in which each clause expresses a ranking rule. Clauses take the form $\text{better}(A,B) \leftarrow L_1, \ldots, L_m, \text{not } ab_1, \ldots, \text{not } ab_k$, where the $L_i$ are literals corresponding to:
- Numeric difference comparison, e.g., $\text{crim}(A) - \text{crim}(B) \le -5.806$.
- Categorical (in)equality, e.g., a feature value being equal or not equal to a given category.
Negative conditions ($\text{not } ab_j$) correspond to abnormal predicates, which recursively define exceptions to overgeneralized defaults. Examples from the Boston Housing dataset demonstrate this clause hierarchy: each $ab_j$ captures an exception for the corresponding default case.
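The default-plus-exception structure can be mirrored directly in Python. The sketch below is purely illustrative: the `crim` threshold is the one quoted in the justification example in Section 6, while the exception `ab1` and its `rm` threshold are invented for the example and are not rules reported by the authors.

```python
from typing import Dict


def ab1(diff: Dict[str, float]) -> bool:
    """Hypothetical exception: A has far fewer rooms than B."""
    return diff["rm"] <= -1.0


def better(diff: Dict[str, float]) -> bool:
    """Default rule with one exception, read as:
    better(A,B) :- crim(A) - crim(B) =< -5.806, not ab1(A,B).
    `diff` maps each numeric feature to its difference value for the pair.
    """
    return diff["crim"] <= -5.806 and not ab1(diff)


default_case = {"rm": 0.688, "crim": -13.35348}    # default fires
exception_case = {"rm": -1.5, "crim": -13.35348}   # default blocked by ab1
uncovered_case = {"rm": 0.688, "crim": 0.0}        # default body fails

print(better(default_case), better(exception_case), better(uncovered_case))
```

Python's `not ab1(diff)` plays the role of negation as failure here: the default conclusion holds unless the exception can be established.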
5. Scalability and Computational Considerations
To ensure efficiency on practical datasets, FOLD-TR introduces a set of optimizations:
- Prefix-sum trick: For threshold selection in numeric-difference literals, candidate thresholds are sorted once and then scored from cumulative counts, so each candidate is evaluated without a fresh pass over all examples for every distinct feature value.
- Subsampling pairs: Limiting training to a sampled subset of rank-adjacent pairs via Normal rank-sampling focuses induction on difficult (nearby) comparisons, both reducing computation and mitigating class imbalance.
- Greedy literal addition and early stopping: New literals are added only while they yield positive information gain and the negative-to-positive ratio remains above the threshold; induction halts otherwise.
- Exception recursion bounded by ratio threshold: The recursion for exception rules is constrained, limiting depth and runtime.
- Empirical complexity: Induction scales roughly linearly in the number of training examples per clause.
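The prefix-sum idea from the list above can be sketched as follows: the values are sorted once, and running counts of positives and negatives at or below each candidate threshold give every split's score without rescanning the data. The function names and the use of textbook information gain as the score are assumptions for this example.

```python
import math
from itertools import groupby
from typing import Optional, Sequence, Tuple


def entropy(pos: int, neg: int) -> float:
    """Binary entropy in bits; 0.0 for pure or empty sets."""
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / (pos + neg)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_threshold(values: Sequence[float],
                   labels: Sequence[bool]) -> Tuple[Optional[float], float]:
    """Best literal `value <= t` by information gain, using running counts."""
    n = len(labels)
    total_pos = sum(labels)
    total_neg = n - total_pos
    base = entropy(total_pos, total_neg)
    ordered = sorted(zip(values, labels))        # single sort
    best_t, best_gain = None, 0.0
    pos_le = neg_le = 0                          # prefix counts so far
    for value, group in groupby(ordered, key=lambda pair: pair[0]):
        for _, label in group:                   # absorb all ties at once
            if label:
                pos_le += 1
            else:
                neg_le += 1
        pos_gt, neg_gt = total_pos - pos_le, total_neg - neg_le
        gain = (base
                - (pos_le + neg_le) / n * entropy(pos_le, neg_le)
                - (pos_gt + neg_gt) / n * entropy(pos_gt, neg_gt))
        if gain > best_gain:
            best_t, best_gain = value, gain
    return best_t, best_gain


diffs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
is_positive = [True, True, True, False, False, False]
threshold, gain = best_threshold(diffs, is_positive)
print(threshold, gain)
```

After the sort, the loop touches each example once, so the cost per feature is dominated by sorting rather than by the number of candidate thresholds.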
Reported empirical results on standard UCI ranking datasets (Boston Housing, Wine Quality, Student Performance) indicate FOLD-TR learns 5–32 rules (21–147 predicates total) with pairwise accuracy 0.63–0.81 and high efficiency on commodity hardware.
6. Explainability and Native Justification
A central property of FOLD-TR is its native explainability. The output is a stratified logic program interpretable by goal-directed ASP solvers such as s(CASP). For any decision better(A,B), the solver can generate a proof tree detailing:
- All clauses involved in the derivation,
- Grounded feature differences and categorical matches,
- Marked truth values for each literal, and
- Bypassed exceptions justified via negation-as-failure.
s(CASP) also supports natural-language justification, e.g.:
“better(A,B) holds because rm(A)-rm(B) > 0.154 (holds), crim(A)-crim(B) ≤ -5.806 (holds), {rm(A)=6.575, rm(B)=5.887, crim(A)=0.00632, crim(B)=13.3598}”
This yields per-decision transparency, revealing which conditions directly led to the ranking and precisely why. Each prediction is traceable to its underlying logic program clause and instantiated feature values.
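To make the idea of a per-decision trace concrete, the sketch below evaluates a conjunction of threshold literals and records whether each one holds. It is a plain-Python stand-in rather than s(CASP) output; the `justify` function and the literal tuple format are assumptions made for this example, and only the `crim` literal and feature values come from the quoted justification.

```python
import operator
from typing import Dict, List, Tuple

OPS = {"<=": operator.le, ">": operator.gt}


def justify(diffs: Dict[str, float],
            literals: List[Tuple[str, str, float]]) -> Tuple[bool, List[str]]:
    """Check each (feature, op, threshold) literal and build a readable trace."""
    trace = []
    holds = True
    for feature, op, threshold in literals:
        ok = OPS[op](diffs[feature], threshold)
        status = "holds" if ok else "fails"
        trace.append(
            f"{feature}(A)-{feature}(B) {op} {threshold}: {status} "
            f"(difference = {diffs[feature]:.5f})"
        )
        holds = holds and ok
    return holds, trace


pair_diffs = {"crim": 0.00632 - 13.3598}
decision, lines = justify(pair_diffs, [("crim", "<=", -5.806)])
print("better(A,B) holds" if decision else "better(A,B) does not hold")
for line in lines:
    print("  " + line)
```

A goal-directed solver additionally records which clause was used and which exceptions were ruled out by negation as failure, which this flat trace does not attempt to reproduce.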
7. Significance and Application Scope
FOLD-TR adapts the FOLD-R++ framework to pairwise ranking, extending logic-based explainable learning to ranking over mixed-type tabular data. Its scalability stems from rank-adjacent sampling and low-complexity literal selection, while its expressiveness derives from the default–exception clause structure combined with direct handling of numerical and categorical features. Its empirical results on common benchmarks, together with the explainability of the learned logic programs, position it as a candidate approach for domains requiring interpretable ranking over heterogeneous attributes (Wang et al., 2022).