
Doc2AHP: LLM-Enhanced AHP for Decision Modeling

Updated 30 January 2026
  • Doc2AHP is a structured inference framework that combines the semantic power of LLMs with the formal hierarchy and numerical rigor of AHP for decision modeling.
  • It utilizes semantic tree construction, multi-agent collaboration, and adaptive consistency optimization to generate decision hierarchies, robust pairwise weights, and alternative rankings.
  • The framework eliminates the need for manual expert annotation and achieves high accuracy and strict numerical consistency in benchmark evaluations.

Doc2AHP is a structured inference framework that integrates the generalization capacity of LLMs with the formal rigor of the Analytic Hierarchy Process (AHP) to enable automated, interpretable multi-criteria decision modeling from unstructured documents. By leveraging semantic tree construction, multi-agent collaboration, and adaptive consistency optimization, Doc2AHP generates decision hierarchies, computes robust pairwise criteria weights, and synthesizes alternative rankings—all while enforcing logical entailment and axiomatic numerical constraints intrinsic to classical AHP. This methodology eliminates the dependency on manual expert annotation and annotated training data, thus addressing scalability and reliability barriers inherent in generic LLM-based decision modeling (Wu et al., 23 Jan 2026).

1. Motivation and Theoretical Foundation

Doc2AHP is motivated by the structural and numerical weaknesses observed in generic LLM outputs when tasked with decision modeling. LLMs, while adept at semantic extraction, frequently produce criteria and pairwise judgements that lack document grounding and violate formal decision-theoretic axioms, leading to hallucinated, incoherent outputs. In contrast, AHP offers a systematic approach: it decomposes decision problems hierarchically and employs pairwise comparisons on a fixed scale ($a_{ij}\in[1,9]$), with weights computed via eigendecomposition and consistency indices:

$$CI = \frac{\lambda_{\max} - n}{n-1}, \quad CR = \frac{CI}{RI}$$

Here, $\lambda_{\max}$ is the principal eigenvalue of the comparison matrix, $n$ is the matrix dimension, and $RI$ is the random index. By requiring $CR\le 0.1$, AHP enforces transitivity and numerical reliability. Doc2AHP bridges the strengths of both paradigms by imposing these structural and numerical constraints on LLM-driven inference, yielding verifiable decision models (Wu et al., 23 Jan 2026).
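The consistency check can be illustrated with a minimal sketch in plain Python. It is not taken from the paper: the function names are hypothetical, the principal eigenvalue is approximated by power iteration, and the random index values are the commonly cited Saaty table for $n \le 10$.

```python
# Commonly cited random index (RI) values for matrix sizes n = 1..10.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def principal_eigenvalue(matrix, iterations=200):
    """Approximate lambda_max of a positive square matrix by power iteration."""
    n = len(matrix)
    v = [1.0 / n] * n  # start from the uniform vector (entries sum to 1)
    lam = float(n)
    for _ in range(iterations):
        mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(mv)              # v sums to 1, so sum(Mv) estimates lambda
        v = [x / lam for x in mv]  # renormalise so entries sum to 1 again
    return lam


def consistency_ratio(matrix):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1); only n <= 10 is covered."""
    n = len(matrix)
    if n <= 2:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    ci = (principal_eigenvalue(matrix) - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# A perfectly consistent matrix (a_ij = w_i / w_j) versus a cyclic, intransitive one.
consistent = [[1.0, 2.0, 6.0], [0.5, 1.0, 3.0], [1 / 6, 1 / 3, 1.0]]
cyclic = [[1.0, 9.0, 1 / 9], [1 / 9, 1.0, 9.0], [9.0, 1 / 9, 1.0]]
print(consistency_ratio(consistent) <= 0.1)  # passes the CR <= 0.1 check
print(consistency_ratio(cyclic) <= 0.1)      # fails it
```

The cyclic matrix encodes "A beats B, B beats C, C beats A", which is the kind of intransitive judgement the $CR \le 0.1$ threshold is meant to reject.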

2. Framework Architecture and Workflow

Doc2AHP comprises two sequential phases:

Phase I: Probabilistic AHP Construction

  • Structure Generation: Semantic embeddings are computed at the paragraph level for each document. Ward’s hierarchical clustering yields a semantic tree pruned under cognitive constraints (maximum branching $K_{\max}$, depth $D_{\max}$, semantic verification threshold $\tau$).
  • Weight Estimation: A Leader-Guided Multi-Agent Collaboration mechanism recruits $K$ Domain Expert Agents, each generating a pairwise matrix $M^{(k)}$. Their outputs are aggregated by weighted geometric mean and projected into AHP-consistent weight space via constrained optimization.

Phase II: Decision Inference

For each alternative $a_k$ and leaf criterion $c_j$, the LLM performs local utility estimation:

$$s_{kj}=\mathbb{E}_{y\sim p_\theta(\cdot\mid c_j,a_k,D)}\bigl[\mathcal{M}_{\mathrm{score}}(y)\bigr]$$

Aggregated utility scores are computed:

$$U(a_k) = \sum_j w_j s_{kj}$$

This complete pipeline ensures interpretability from raw documents $D$ through hierarchy $\mathcal{H}$, weights $\mathbf{w}$, to alternative scores $U(a_k)$ (Wu et al., 23 Jan 2026).
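As a rough illustration of the decision-inference phase, the sketch below approximates the expectation for a local score by the mean of scores parsed from sampled LLM outputs, then forms the weighted sum. The function names, criteria, and numbers are hypothetical toy values, not results from the paper.

```python
from statistics import mean


def estimate_local_score(sampled_scores):
    """Monte Carlo stand-in for s_kj: the mean of scores parsed from sampled outputs."""
    return mean(sampled_scores)


def aggregate_utility(weights, scores):
    """U(a_k) = sum_j w_j * s_kj for every alternative a_k."""
    return {alt: sum(weights[c] * per_criterion[c] for c in weights)
            for alt, per_criterion in scores.items()}


def rank_alternatives(weights, scores):
    """Alternatives sorted by aggregated utility, best first."""
    utilities = aggregate_utility(weights, scores)
    return sorted(utilities, key=utilities.get, reverse=True)


# Hypothetical leaf-criterion weights (summing to 1) and local scores in [0, 1].
weights = {"plot": 0.5, "acting": 0.3, "visuals": 0.2}
scores = {
    "film_a": {"plot": estimate_local_score([0.7, 0.9]), "acting": 0.6, "visuals": 0.4},
    "film_b": {"plot": 0.6, "acting": 0.9, "visuals": 0.9},
}
print(rank_alternatives(weights, scores))
```

Because each utility is a plain weighted sum, the contribution of every criterion to a final ranking can be read off directly, which is where the interpretability claim comes from.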

3. Semantic Tree Generation and Hierarchy Mapping

Semantic tree construction commences with embedding paragraphs ($p_{i,j}$) from documents ($d_i$) into $\mathbb{R}^k$ vectors ($\mathbf{v}_{i,j}$), followed by Ward’s method to build a hierarchical tree $\mathcal{T}=(\mathcal{N},\mathcal{E})$. Top-down recursive pruning, documented in Algorithm 1 of the source, yields an AHP hierarchy ($\mathcal{H}$):

  • Root criterion ($c_0$) is attached to the tree root.
  • At each node (depth $d<D_{\max}$), the subtree is split into $m$ sub-clusters ($2 \leq m \leq K_{\max}$), maximizing semantic separation.
  • Each sub-cluster is summarized via LLM into a criterion label ($c_i \leftarrow \mathrm{LLM}_{\mathrm{gen}}(\text{Text}(u_i)\mid\mathcal{P}_{u_i})$), followed by entailment verification ($\mathrm{LLM}_{\mathrm{verify}}(c_i,c) \geq \tau$).
  • Links that pass semantic verification are recursively explored.

The resultant tree respects cognitive constraints and grounds criteria/subcriteria labels in document semantics (Wu et al., 23 Jan 2026).
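The recursion described above can be sketched as follows. This is not a reproduction of Algorithm 1: the clustering step and both LLM calls are passed in as hypothetical callables (`split`, `label`, `verify`), so only the control flow (depth limit, branching bound, threshold check) is shown.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Criterion:
    label: str
    children: List["Criterion"] = field(default_factory=list)


def build_hierarchy(cluster: Any, parent: Criterion, depth: int, *,
                    split: Callable[[Any, int], list],
                    label: Callable[[Any], str],
                    verify: Callable[[str, str], float],
                    k_max: int, d_max: int, tau: float) -> None:
    """Attach verified sub-criteria of `cluster` under `parent`, recursing top-down."""
    if depth >= d_max:
        return  # depth constraint D_max reached
    sub_clusters = split(cluster, k_max)
    if not 2 <= len(sub_clusters) <= k_max:
        return  # no admissible split with 2 <= m <= K_max
    for sub in sub_clusters:
        candidate = label(sub)  # stand-in for the LLM labelling call
        if verify(candidate, parent.label) >= tau:  # stand-in for entailment check
            child = Criterion(candidate)
            parent.children.append(child)
            build_hierarchy(sub, child, depth + 1, split=split, label=label,
                            verify=verify, k_max=k_max, d_max=d_max, tau=tau)


# Toy stand-ins: halve a list, join items into a label, reject labels containing "x".
def halve(items, k_max):
    return [items[:len(items) // 2], items[len(items) // 2:]] if len(items) >= 2 else []


root = Criterion("goal")
build_hierarchy(["a", "b", "x", "d"], root, 0, split=halve,
                label=lambda items: "+".join(items),
                verify=lambda child, parent: 0.0 if "x" in child else 1.0,
                k_max=3, d_max=2, tau=0.5)
print([c.label for c in root.children])
```

In a real pipeline `split` would cut the Ward dendrogram and the other two callables would wrap LLM prompts. Injecting them keeps the sketch runnable without any model access.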

4. Multi-Agent Judgement and Consensus Aggregation

Upon establishing the hierarchy, pairwise comparisons among sibling criteria are solicited from $K$ expert agents. Each agent $k$ generates a matrix $M^{(k)} = [a^{(k)}_{ij}]$ with $a^{(k)}_{ij}\in[1,9]$. Aggregation uses a weighted geometric mean:

$$\bar{a}_{ij} = \prod_{k=1}^K \bigl(a^{(k)}_{ij}\bigr)^{\gamma_k}, \quad \sum_{k=1}^K \gamma_k = 1$$

Weights $\gamma_k$ typically default to $1/K$ unless modified by the Leader Agent based on domain expertise. The resulting consensus matrix $\bar{M}$ is then processed for consistency (Wu et al., 23 Jan 2026).
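A minimal sketch of the aggregation step, with a hypothetical function name and toy matrices. It computes the weighted geometric mean elementwise in log space and falls back to uniform weights $1/K$ when none are supplied.

```python
import math


def aggregate_judgements(matrices, gammas=None):
    """Elementwise weighted geometric mean of K positive pairwise matrices."""
    k = len(matrices)
    if gammas is None:
        gammas = [1.0 / k] * k  # default: every agent weighted equally
    if abs(sum(gammas) - 1.0) > 1e-9:
        raise ValueError("agent weights gamma_k must sum to 1")
    n = len(matrices[0])
    return [[math.exp(sum(g * math.log(m[i][j]) for g, m in zip(gammas, matrices)))
             for j in range(n)]
            for i in range(n)]


# Two hypothetical agents disagreeing on how strongly criterion 1 dominates criterion 2.
agent_1 = [[1.0, 2.0], [0.5, 1.0]]
agent_2 = [[1.0, 8.0], [0.125, 1.0]]
consensus = aggregate_judgements([agent_1, agent_2])
print(consensus)
```

With equal weights the consensus entry is $\sqrt{2 \cdot 8} = 4$, and its mirror entry is $1/4$. The geometric mean preserves reciprocity ($\bar{a}_{ji} = 1/\bar{a}_{ij}$) whenever each agent's matrix is reciprocal, which an arithmetic mean would not.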

5. Adaptive Consistency Optimization

Doc2AHP applies convex Logarithmic Least Squares (LLS) optimization to project $\bar{M}$ into the valid AHP space, incorporating leader-imposed domain constraints $\Omega_{\text{leader}}$:

$$\begin{aligned} \mathbf{w}^* = \arg\min_{\mathbf{w}}\ & \sum_{i=1}^n \sum_{j=1}^n \left( \ln \bar{a}_{ij} - \ln \frac{w_i}{w_j} \right)^2 \\ \text{s.t.}\ & \sum_{i=1}^n w_i = 1,\ w_i > 0,\ w_i \geq \beta_{ij} w_j\ \ \forall\, (i,j,\beta_{ij})\in\Omega_{\text{leader}} \end{aligned}$$

The optimization yields $\mathbf{w}^*$ with:

$$CI = \frac{\lambda_{\max} - n}{n-1}, \quad CR = \frac{CI}{RI} \leq 0.1$$

If discretization is necessary, the ratios $\{w_i/w_j\}$ are rounded to the nearest AHP admissible value (Wu et al., 23 Jan 2026).
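For intuition, the sketch below covers only the unconstrained special case, where $\Omega_{\text{leader}}$ is empty and $\bar{M}$ is reciprocal. In that case the LLS minimizer has a known closed form: the normalized row geometric means. Handling the leader constraints would need a constrained solver, which is not shown. The discretization helper assumes that the admissible values are the Saaty scale $\{1/9,\dots,1,\dots,9\}$ and that "nearest" is measured in log space. Both are illustrative assumptions, as are the function names.

```python
import math


def lls_weights(matrix):
    """Unconstrained LLS solution for a reciprocal matrix: normalised row geometric means."""
    n = len(matrix)
    geo = [math.exp(sum(math.log(a) for a in row) / n) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]


def ratio_matrix(weights):
    """The perfectly consistent matrix [w_i / w_j] implied by a weight vector."""
    return [[wi / wj for wj in weights] for wi in weights]


# Assumed admissible values: 1/9, 1/8, ..., 1/2, 1, 2, ..., 9.
SAATY_SCALE = [1.0 / k for k in range(9, 1, -1)] + [float(k) for k in range(1, 10)]


def snap_to_scale(ratio):
    """Round a positive ratio to the closest scale value, measured in log space."""
    return min(SAATY_SCALE, key=lambda s: abs(math.log(s) - math.log(ratio)))


# A hypothetical, slightly inconsistent consensus matrix (2 * 3 != 8).
m_bar = [[1.0, 2.0, 8.0], [0.5, 1.0, 3.0], [0.125, 1 / 3, 1.0]]
w_star = lls_weights(m_bar)
projected = ratio_matrix(w_star)
discrete = [[snap_to_scale(r) for r in row] for row in projected]
print(w_star)
print(discrete)
```

The projected matrix `ratio_matrix(w_star)` has $\lambda_{\max} = n$ and therefore $CR = 0$ by construction. Rounding its entries to the discrete scale can reintroduce a small inconsistency, so $CR$ would need rechecking after discretization.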

6. Empirical Results and Validation

Doc2AHP was evaluated using DecisionBench, a suite of six decision scenarios built atop IMDb, HotelRec, and Beer Advocate datasets, each presenting 20 candidate alternatives. Baseline comparisons include Standard-AHP (single-agent, without consistency enforcement) and Debate-AHP (multi-agent negotiation without formal constraints). Metrics include ranking accuracy (NDCG@5, NDCG@10), numerical reliability ($CR_{\max}$, $CR_{\mathrm{mean}}$), and pass rate $\Pr[CR<0.1]$.
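For readers unfamiliar with these metrics, here is a minimal sketch of how they are typically computed. It assumes the linear-gain form of DCG with a $\log_2$ discount; the source does not state which NDCG variant was used, and the function names and numbers are illustrative only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k):
    """DCG of the produced ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0


def pass_rate(cr_values, threshold=0.1):
    """Empirical Pr[CR < threshold] over a collection of comparison matrices."""
    return sum(cr < threshold for cr in cr_values) / len(cr_values)


# Ground-truth relevance of items in the order a hypothetical model ranked them.
print(ndcg_at_k([3, 2, 1], k=3))            # ideal ordering
print(ndcg_at_k([1, 2, 3], k=3))            # reversed ordering scores lower
print(pass_rate([0.02, 0.09, 0.30, 0.00]))  # three of four matrices pass
```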

Key findings:

  • Doc2AHP achieved top NDCG@5 in five of six tasks (e.g., 0.854 vs. 0.830 Standard, 0.777 Debate in "Narrative Drama").
  • Doc2AHP maintained 100% pass rates for $CR<0.1$ across model variations (Llama-8B, Llama-70B, GPT-5.2); baselines ranged as low as 0%.
  • Ablation studies indicated the critical impact of semantic structuring and consistency optimization on ranking quality and numerical rigor (Wu et al., 23 Jan 2026).

7. Discussion, Applications, and Future Directions

Doc2AHP demonstrates the viability of combining AHP’s formal, auditable scaffolds with LLM semantic generalization to elevate decision modeling above black-box intuition-based prompting toward verifiable, logically consistent outputs. The recursive semantic construction and optimization incur greater computational cost but are justified in high-stakes domains (e.g., medical, security) where reliability is paramount. For low-risk settings, simpler LLM-based methods may suffice.

Potential future developments include:

  • Adaptive pruning for scalability across large document corpora.
  • Cross-domain generalization of semantic tree-to-criterion mappings.
  • Incorporation of human-in-the-loop feedback for dynamic updating of leader constraints $\Omega_{\text{leader}}$.

A plausible implication is that Doc2AHP could make scalable decision modeling accessible to non-experts, supporting transparent and auditable decision processes across diverse research and application contexts (Wu et al., 23 Jan 2026).
