ExplainerFC-Expert Model
- ExplainerFC-Expert is a post-hoc, model-agnostic method that generates both factual and counterfactual explanations for expert search, team formation, and classification tasks.
- It integrates local explanatory techniques like SHAP and IME with advanced feature construction and pruning strategies to ensure scalable, high-fidelity explanations.
- Empirical results reveal up to 100× speedup and enhanced model accuracy, validating its practical utility in financial analysis and collaborative network datasets.
The ExplainerFC-Expert Model (also referred to as EFC-Expert) is a post-hoc, model-agnostic methodology for generating interpretable factual and counterfactual explanations in expert search, team formation, and general classification contexts. It combines local explanation methods such as SHAP or IME with feature construction algorithms, enabling actionable transparency in systems built on structured collaboration networks or attribute data. The approach is explicitly designed for scalability, conciseness of explanations, and rigorous evaluation of both explanation fidelity and constructed feature utility (Golzadeh et al., 2024, Vouk et al., 2023).
1. Mathematical Formulation and Problem Setup
Expert search and team formation systems operate on labeled graphs $G = (V, E)$, where $V$ is the set of nodes (experts), $E \subseteq V \times V$ encodes undirected collaborations, and $S(v)$ is the set of skills attached to node $v$. Queries $q$ specify required skills. The system responds via a ranking over $V$, or equivalently, a real-valued expert score $s(v \mid q)$ per node, driving a binary classifier for “expert” or “team-member” membership.
For standard supervised learning tasks, the model is given a training set $\{(x_i, y_i)\}_{i=1}^{n}$, with instances $x_i \in \mathcal{X}$ and labels $y_i \in \mathcal{Y}$. Explanations are generated for individual instances $x$, typically focusing on a class of interest.
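The setup above can be sketched as a small data structure. The class and the toy scoring rule below are illustrative assumptions, not the papers' implementation; the score simply measures how many query skills are covered by a node and its direct collaborators.

```python
from dataclasses import dataclass

@dataclass
class ExpertGraph:
    adj: dict[str, set[str]]     # undirected collaborations E, as adjacency sets
    skills: dict[str, set[str]]  # S(v): skills attached to node v

    def expert_score(self, v: str, query: set[str]) -> float:
        """Toy score s(v | q): fraction of query skills covered by v
        together with its immediate collaborators."""
        covered = set(self.skills.get(v, set()))
        for u in self.adj.get(v, set()):
            covered |= self.skills.get(u, set())
        return len(query & covered) / max(len(query), 1)

g = ExpertGraph(
    adj={"ana": {"bo"}, "bo": {"ana"}},
    skills={"ana": {"nlp", "graphs"}, "bo": {"ml"}},
)
print(g.expert_score("ana", {"nlp", "ml"}))  # 1.0: both query skills covered locally
```

Thresholding such a score (e.g., at 0.5) yields the binary "expert" classifier that the explanation algorithms operate on.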
2. Factual Explanation Algorithms
A factual explanation identifies the minimal subset of input features most responsible for a node being classified as an expert, or an instance receiving a particular prediction. Features include query skills, node–skill pairs in the local $k$-hop neighborhood $N_k(v)$, and edges among local collaborators.
Factual explanation construction proceeds by aggregating saliency scores $\phi_i$ derived from instance-based explanation methods such as SHAP (Golzadeh et al., 2024, Vouk et al., 2023). Features with the highest absolute attributions $|\phi_i|$ are prioritized; the selected subset $F$ must preserve the overall score, i.e., dropping the non-selected features decreases it only marginally:

$$s(v \mid q) - s_F(v \mid q) \le \epsilon$$

for some threshold $\epsilon$, where $s_F$ denotes the score recomputed using only the features in $F$. Algorithmically, this involves constructing feature lists, restricting computation to the local neighborhood $N_k(v)$, assigning SHAP scores to skills and edges, then extracting the top-$k$ skill and edge features.
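A minimal sketch of this selection step, using leave-one-out score drops as a cheap stand-in for the SHAP values $\phi_i$ (the feature names and the additive toy score are illustrative, not from the original papers):

```python
def factual_explanation(score, x, top_k=3):
    """Rank features by leave-one-out score drop (a stand-in for |phi_i|)
    and return the top_k most responsible ones."""
    base = score(x)
    drop = {f: base - score({k: v for k, v in x.items() if k != f})
            for f in x}
    return sorted(x, key=lambda f: abs(drop[f]), reverse=True)[:top_k]

# Toy additive expert score over active skill/edge features.
weights = {"skill:nlp": 0.6, "edge:ana-bo": 0.3, "skill:java": 0.01}
score = lambda feats: sum(weights.get(f, 0.0) * v for f, v in feats.items())
x = {"skill:nlp": 1.0, "edge:ana-bo": 1.0, "skill:java": 1.0}
print(factual_explanation(score, x, top_k=2))  # ['skill:nlp', 'edge:ana-bo']
```

In the actual method, the per-feature attributions come from SHAP or IME rather than exhaustive leave-one-out recomputation, which is what makes the neighborhood pruning of Section 4 essential.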
In EFC-Expert feature construction (Vouk et al., 2023), factual aggregation extends to attribute groups: for each instance, attributes are ranked by normalized explanation magnitude, active groups are selected until their cumulative attribution mass reaches a threshold $\theta$, and groups that co-occur frequently across the explained instances are retained for operator-based feature generation.
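The cumulative-threshold selection can be sketched as follows; the attribute names and scores are illustrative assumptions, and the real method operates on groups of attributes rather than single ones:

```python
def select_active_attributes(attr_scores, theta=0.9):
    """Rank attributes by normalized explanation magnitude and keep the
    smallest prefix whose cumulative attribution mass reaches theta."""
    total = sum(abs(s) for s in attr_scores.values()) or 1.0
    ranked = sorted(attr_scores, key=lambda a: abs(attr_scores[a]), reverse=True)
    selected, mass = [], 0.0
    for a in ranked:
        selected.append(a)
        mass += abs(attr_scores[a]) / total
        if mass >= theta:
            break
    return selected

scores = {"income": 0.5, "debt": -0.3, "age": 0.1, "zip": 0.1}
print(select_active_attributes(scores, theta=0.85))  # ['income', 'debt', 'age']
```

Running this per instance and counting which attribute sets recur across the dataset gives the frequent co-occurring groups fed to the constructive operators of Section 5.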
3. Counterfactual Explanation Framework
Counterfactual explanations seek minimal perturbations to the input (addition or removal of skills, collaborations, or query augmentations) that result in a label flip for the instance $x$, either to demote or promote an expert.
The optimization is a standard minimum-cardinality counterfactual search:

$$\delta^{*} = \arg\min_{\delta} \|\delta\|_{0} \quad \text{s.t.} \quad f(x \oplus \delta) \neq f(x),$$

where $\delta$ ranges over admissible edits and $x \oplus \delta$ denotes the perturbed instance.
Candidate generation is performed over a pruned set of features (see Section 4), with exponential search complexity mitigated using beam-search heuristics. The procedure limits the number of candidate features to a budget $n_c$, the beam width to $b$, the depth of search to $d$, and the number of required explanations to $m$. This approach yields explanations and actionable recommendations, such as suggesting skill acquisition or collaboration adjustments to become (or cease to be) an expert (Golzadeh et al., 2024).
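The bounded search can be sketched as a beam search over single-feature edits. Everything here is a simplified assumption: the score is a toy 0.5-threshold classifier, `candidate_edits` stands for the pruned feature set of Section 4, and the edit names are made up for illustration.

```python
def counterfactual_beam(score, x, candidate_edits, beam_width=5, depth=3):
    """Search for a shortest edit sequence that flips a 0.5-threshold
    classifier. candidate_edits is a pruned list of (name, apply_fn) pairs."""
    start_label = score(x) >= 0.5
    beam = [((), x)]
    for _ in range(depth):
        expanded = []
        for edits, state in beam:
            for name, apply_edit in candidate_edits:
                if name in edits:
                    continue
                new_state = apply_edit(state)
                new_edits = edits + (name,)
                if (score(new_state) >= 0.5) != start_label:
                    return new_edits                      # label flipped
                expanded.append((new_edits, new_state))
        # keep the beam_width partial solutions closest to the boundary
        expanded.sort(key=lambda es: abs(score(es[1]) - 0.5))
        beam = expanded[:beam_width]
    return None

def add_skill(s):
    return lambda skills: skills | {s}

score = lambda skills: 0.3 * len(skills & {"ml", "nlp", "graphs"})
edits = [("add:ml", add_skill("ml")), ("add:nlp", add_skill("nlp"))]
found = counterfactual_beam(score, frozenset({"java"}), edits)
print(found)  # ('add:ml', 'add:nlp'): acquiring both skills flips the label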
4. Pruning Strategies and Computational Optimizations
Both factual and counterfactual explanation construction employ several pruning heuristics to ensure scalability:
- Network locality: Restrict skills and edges considered to the $k$-hop local neighborhood $N_k(v)$.
- Influential collaborations: Use SHAP magnitude thresholds to prioritize edges, only expanding neighbors with sufficiently high attribution.
- Word-embedding filtering: Apply a word2vec model trained on the corpus to select semantically proximate skill additions for positive counterfactuals, or semantically distant skills for negative interventions.
- Link-prediction filtering: Use a Graph AutoEncoder (GAE) for edge suggestions/removals, focusing only on the top-ranked probable or most influential collaborations.
- Beam search: Bounds computational cost to $O(b \cdot d \cdot n_c)$ for beam width $b$, search depth $d$, and $n_c$ candidate features, a marked improvement over naive exponential enumeration.
In feature construction for general classification tasks, EFC-Expert further prunes by frequency thresholding (“noiseThr”), limiting attribute-group sizes to keep the combinatorics tractable, and discarding low-utility features via MDL or impurity scores (Vouk et al., 2023).
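The network-locality heuristic is the simplest of these prunings and amounts to a depth-bounded BFS; the sketch below (illustrative, not the papers' code) computes the $N_k(v)$ restriction over an adjacency-dict graph.

```python
from collections import deque

def k_hop_neighborhood(adj, v, k):
    """Return all nodes within k hops of v, i.e. the N_k(v) restriction
    used to limit which skills and edges the explainers consider."""
    seen = {v}
    frontier = deque([(v, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand past the hop budget
        for u in adj.get(node, ()):
            if u not in seen:
                seen.add(u)
                frontier.append((u, dist + 1))
    return seen

adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(sorted(k_hop_neighborhood(adj, "a", 2)))  # ['a', 'b', 'c']
```

Every later stage (SHAP attribution, candidate edit generation, GAE filtering) then runs only over this restricted node set, which is where most of the reported speedup originates.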
5. Constructive Operators for Feature Induction
The EFC-Expert methodology supports a comprehensive suite of constructive operators for enriched feature creation from detected attribute groups (Vouk et al., 2023). These include:
- Logical operators: AND, OR, equivalence, XOR, implication, applied after discretization.
- Relational operators: Pairwise comparisons, numeric thresholds.
- Cartesian-product operators: Conjunctions over categorical values.
- Numerical operators: Sums, products, pairwise quotients.
- Threshold operators: num-of-N, X-of-N, all-of-N, M-of-N semantics for logic aggregation.
Rule-based features extracted via decision-rule learners (e.g., FURIA restricted to attribute groups) and thresholded rule compositions are also supported.
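Two of the operator families above can be sketched in a few lines; the attribute names, thresholds, and the "credit indicator" framing are illustrative assumptions, not features from the evaluated datasets.

```python
def above(attr, t):
    """Relational operator: numeric threshold comparison on one attribute."""
    return lambda x: x[attr] > t

def m_of_n(conditions, m):
    """M-of-N threshold operator: the constructed feature fires when at
    least m of the n boolean conditions hold on an instance."""
    return lambda x: sum(cond(x) for cond in conditions) >= m

# Example constructed feature: "at least 2 of 3 credit indicators hold".
indicators = [above("income", 50_000),
              above("savings", 10_000),
              lambda x: x["debt"] < 5_000]
feature = m_of_n(indicators, 2)
print(feature({"income": 60_000, "savings": 2_000, "debt": 1_000}))  # True
```

Composing such operators over the frequent attribute groups from Section 2 yields new columns that interpretable learners (decision trees, Naïve Bayes) can consume directly.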
6. Empirical Evaluation and Practical Use Cases
Evaluation on collaboration networks (DBLP, GitHub), synthetic benchmarks, UCI classification datasets, and financial credit-scoring tasks demonstrates:
- Speedup: Pruning reduces factual explanation latency from 50–200s to ≤5s and counterfactual explanation from ≥800s to ≤100s—10×–100× faster than exhaustive search, with negligible loss of explanation quality (≥90% agreement with minimal sets) (Golzadeh et al., 2024).
- Accuracy and interpretability: EFC-Expert features increase accuracy for interpretable models (decision trees, Naïve Bayes, SVMs); notably, enriched Naïve Bayes and decision trees match or outperform XGBoost on several datasets (Vouk et al., 2023).
- Domain expert validation: In financial datasets, constructed features rediscover standard financial ratios and introduce interpretable conjunctive/arithmetic rules, validated by domain experts via high Spearman rank correlation with expert assessments (Vouk et al., 2023).
- Feature comprehensibility: Generated features retain human interpretability and align with expert-driven logic in practical deployments.
7. Significance and Outlook
ExplainerFC-Expert synthesizes the strengths of local explanation methods and structured feature construction, operationalizing explanation in expert search, team formation, and broad classification. Its pruning and operator strategies render otherwise intractable searches feasible at interactive speeds, supporting both transparency and substantive model improvement (Golzadeh et al., 2024, Vouk et al., 2023). The codebase enables ongoing research in expanding the operator library, optimizing feature selection, and specializing explanations for domain-specific graphs and structured data.
A plausible implication is that future expert systems can leverage these explanation frameworks for both diagnostics and actionable user guidance, enhancing trust, adaptability, and performance in network-based and attribute-driven applications.