
Query-Conditioned Deterministic Inference Networks

Updated 14 November 2025
  • QDINs are modular architectures that deterministically map structured queries and observed states to specialized answer spaces using advanced encoding and attention-based fusion.
  • They integrate state and query encoders with specialized inference heads (reachability, path, comparison, policy) to efficiently extract actionable, interpretable outputs.
  • Empirical results show that mixed-objective training balances precise inference and robust control, achieving high performance in both structured query answering and reinforcement learning tasks.

Query-Conditioned Deterministic Inference Networks (QDIN) are a class of architectures designed to answer families of structured queries about probabilistic or dynamical systems in a deterministic, efficient, and interpretable manner. Originating from the intersection of probabilistic graphical models, operator-theoretic learning, and reinforcement learning, QDINs generalize traditional inference networks by parameterizing the entire mapping from a query (specifying conditioning, question type, and parameters) and an observed state to the corresponding answer. In deterministic reinforcement learning, QDINs reimagine agents as modular, queryable knowledge bases: rather than only selecting actions, the agent can efficiently answer questions such as reachability, path extraction, state comparisons, or policy queries "on demand."

1. Formal Definition and Architectural Framework

Let $\mathcal{S}$ denote the (possibly high-dimensional) state space and $\mathcal{Q}$ the set of structured queries. For each query $q \in \mathcal{Q}$, there is an associated answer space $\mathcal{Y}_q$. A QDIN is defined as a parameterized function:

$$f_\theta : \mathcal{S} \times \mathcal{Q} \to \mathcal{Y}, \qquad \mathcal{Y} = \bigcup_{q \in \mathcal{Q}} \mathcal{Y}_q,$$

implemented by composing a state encoder, a query encoder, and a fusion mechanism (often attention-based) feeding into specialized, query-specific inference heads.

a. State and Query Encoding:

  • Multi-scale convolutional layers extract hierarchical features from the input state, e.g., for an agent in a spatial grid:
    • $h_1 = \mathrm{Conv}_{3\times3}(s;\,32)$, $h_2 = \mathrm{Conv}_{3\times3}(h_1;\,64)$, $h_3 = \mathrm{Conv}_{3\times3}(h_2;\,128)$.
  • The query $q = (\text{type}, \text{params})$ is mapped to an embedding (sketched below):
    • The discrete type passes through a learned lookup table and the params through an MLP; the concatenated embedding is layer-normalized ($h_q \in \mathbb{R}^{80}$).
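The following is a minimal PyTorch sketch of the two encoders as described above. The input channel count, the number of query types, the params dimensionality, and the use of strided convolutions to make the features multi-scale are all illustrative assumptions, not details fixed by the source.

```python
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    """Multi-scale 3x3 conv features h1, h2, h3 (channel widths 32/64/128).
    Strides are an assumption used to make the feature maps multi-scale."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 32, 3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, stride=2, padding=1)
        self.act = nn.ReLU()

    def forward(self, s: torch.Tensor):
        h1 = self.act(self.conv1(s))   # (B, 32, H, W)
        h2 = self.act(self.conv2(h1))  # (B, 64, H/2, W/2)
        h3 = self.act(self.conv3(h2))  # (B, 128, H/4, W/4)
        return h1, h2, h3

class QueryEncoder(nn.Module):
    """Discrete type via a learned lookup table, params via an MLP; the
    concatenation is layer-normalized into an 80-dim embedding."""
    def __init__(self, num_types: int = 4, param_dim: int = 8, emb_dim: int = 80):
        super().__init__()
        self.type_emb = nn.Embedding(num_types, emb_dim // 2)
        self.param_mlp = nn.Sequential(
            nn.Linear(param_dim, emb_dim // 2), nn.ReLU(),
            nn.Linear(emb_dim // 2, emb_dim // 2))
        self.norm = nn.LayerNorm(emb_dim)

    def forward(self, q_type: torch.Tensor, q_params: torch.Tensor) -> torch.Tensor:
        h_q = torch.cat([self.type_emb(q_type), self.param_mlp(q_params)], dim=-1)
        return self.norm(h_q)  # h_q in R^80
```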

b. Query–State Fusion:

  • A single-head cross-attention module (as in Vaswani et al., 2017) lets the query embedding attend over the spatial state features, producing $h_\mathrm{fused}$ (sketched below).
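A minimal sketch of the fusion step, following this subsection's description (the query attends over flattened spatial features, yielding a fused vector); note that the module table in Section 3 instead reports a spatial $W \times H \times C$ fused map, which a variant could obtain by using the spatial features as the attention queries. Dimensions match the encoder sketch above.

```python
import torch
import torch.nn as nn

class QueryStateFusion(nn.Module):
    """Single-head cross-attention: Q from the query embedding, K/V from the
    flattened spatial state features."""
    def __init__(self, query_dim: int = 80, state_dim: int = 128):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=query_dim, num_heads=1,
                                          kdim=state_dim, vdim=state_dim,
                                          batch_first=True)

    def forward(self, h_q: torch.Tensor, h_s: torch.Tensor) -> torch.Tensor:
        # h_q: (B, 80) query embedding; h_s: (B, 128, H, W) state features.
        kv = h_s.flatten(2).transpose(1, 2)  # (B, H*W, 128) spatial tokens
        q = h_q.unsqueeze(1)                 # (B, 1, 80)
        h_fused, _ = self.attn(q, kv, kv)    # (B, 1, 80)
        return h_fused.squeeze(1)            # fused query-conditioned summary
```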

c. Specialized Heads:

Given the query type, $h_\mathrm{fused}$ is routed to the corresponding inference head:

  • Reachability: produces a spatial mask $\hat{R}_H(s)$ marking the states reachable within $H$ steps.
  • Path: an LSTM-pointer network extracts explicit waypoints and estimated distances.
  • Comparison: an MLP classifier over absolute feature differences for relative queries among goals.
  • Policy: a standard softmax action distribution for direct control queries.

This modular structure ensures that, for every $(s, q)$, $f_\mathrm{QDIN}(s, q) = \operatorname{Head}_{q.\mathrm{type}}\big(h_\mathrm{fused}(s, q)\big) \in \mathcal{Y}_{q.\mathrm{type}}$.
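A hedged sketch of this dispatch, tying the pieces together; the head modules and the choice to fuse against the coarsest features $h_3$ are assumptions:

```python
import torch.nn as nn

class QDIN(nn.Module):
    """Deterministic routing: f(s, q) = Head_{q.type}(h_fused(s, q))."""
    def __init__(self, state_enc, query_enc, fusion, heads: nn.ModuleDict):
        super().__init__()
        self.state_enc, self.query_enc, self.fusion = state_enc, query_enc, fusion
        self.heads = heads  # keys: "reachability", "path", "comparison", "policy"

    def forward(self, s, q_type: str, q_type_id, q_params):
        h1, h2, h3 = self.state_enc(s)
        h_q = self.query_enc(q_type_id, q_params)
        h_fused = self.fusion(h_q, h3)
        # Route to the head matching the query type. Spatial heads may also
        # consume encoder features (skip connections), elided here.
        return self.heads[q_type](h_fused)
```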

2. Training Objectives and Loss Structure

QDINs are trained with a multi-objective loss that jointly optimizes for control and various query-answering capacities:

$$\mathcal{L}(\theta) = \alpha_\mathrm{control}\,\mathcal{L}_\mathrm{TD} + \sum_{q \in \mathcal{Q}} \alpha_q\,\mathcal{L}_q + \lambda\,\mathcal{L}_\mathrm{consistency}$$

  • $\mathcal{L}_\mathrm{TD}$: temporal-difference loss for value or policy learning.
  • $\mathcal{L}_q$: losses specialized to each query type (per-pixel binary cross-entropy for reachability, composite CE-MAE for paths, etc.).
  • $\mathcal{L}_\mathrm{consistency}$: enforces logical relations (e.g., consistency between reachability and path length).
  • Each per-query loss is normalized by an exponential moving average of its own magnitude, mitigating head dominance during early training (a sketch follows below).
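A minimal sketch of this objective with the EMA normalization; the coefficients, decay, and epsilon are illustrative assumptions:

```python
class MixedObjective:
    """Mixed control/query loss with per-head EMA normalization."""
    def __init__(self, alphas: dict, alpha_control: float = 1.0,
                 lam: float = 0.1, ema_decay: float = 0.99):
        self.alphas = alphas                  # per-query-type weights alpha_q
        self.alpha_control = alpha_control    # weight on the TD loss
        self.lam = lam                        # weight on the consistency loss
        self.ema_decay = ema_decay
        self.ema = {k: 1.0 for k in alphas}   # running loss scale per head

    def __call__(self, td_loss, query_losses: dict, consistency_loss):
        total = self.alpha_control * td_loss + self.lam * consistency_loss
        for q_type, loss in query_losses.items():
            # Track each head's loss magnitude and divide by it, so that no
            # single head dominates the gradient early in training.
            self.ema[q_type] = (self.ema_decay * self.ema[q_type] +
                                (1.0 - self.ema_decay) * float(loss.detach()))
            total = total + self.alphas[q_type] * loss / (self.ema[q_type] + 1e-8)
        return total
```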

This structure explicitly promotes learning distinct, query-specific representations alongside traditional control, enabling high accuracy for diverse inference patterns.

3. Specialized Architectures in Practice

The QDIN architecture is strongly modular:

| Module | Inputs / Features | Output / Functionality |
| --- | --- | --- |
| State Encoder | $s$ via 3 conv layers, skip from $h_1$ | $h_1, h_2, h_3$ (multi-scale features) |
| Query Encoder | type (embedding), params (MLP), layer norm | $h_q$ ($\mathbb{R}^{80}$ query embedding) |
| Fusion | cross-attention: $Q = h_q$, $K, V = h_s$ | $h_\mathrm{fused}$, shape $W \times H \times C$ |
| Heads | $h_\mathrm{fused}$, additional params | query-dependent output |
  • Reachability head: two ConvTranspose blocks with a skip connection; outputs a mask through a sigmoid activation (sketched after this list).
  • Path head: LSTM with pointer attention over the spatial grid; predicts a waypoint sequence plus distance.
  • Comparison head: small MLP of size $64 \rightarrow 32 \rightarrow 1$.
  • Policy head: linear layer mapping to $|\mathcal{A}|$ actions with a softmax.
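A sketch of the reachability head under the assumptions of the earlier encoder sketch: the fused query vector is tiled over the coarse $h_3$ grid (one way to realize the spatial fused map from the table above), then two stride-2 ConvTranspose blocks restore the input resolution, with an additive skip from $h_1$:

```python
import torch
import torch.nn as nn

class ReachabilityHead(nn.Module):
    """Decodes a per-pixel reachability mask from fused query/state features."""
    def __init__(self, fused_dim: int = 80, state_channels: int = 128):
        super().__init__()
        self.up1 = nn.ConvTranspose2d(state_channels + fused_dim, 64,
                                      kernel_size=4, stride=2, padding=1)
        self.up2 = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
        self.out = nn.Conv2d(32, 1, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, h_fused, h3, h1_skip):
        # Tile the fused query vector over the coarse spatial grid.
        B, _, H, W = h3.shape
        tiled = h_fused[:, :, None, None].expand(B, h_fused.shape[1], H, W)
        x = self.act(self.up1(torch.cat([h3, tiled], dim=1)))
        x = self.act(self.up2(x)) + h1_skip     # skip connection from h1
        return torch.sigmoid(self.out(x))       # per-pixel mask, i.e. R_H(s)
```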

Each module is optimized for its query-specific inference pattern. Deactivating or ablating these modules significantly degrades the corresponding metrics (e.g., removing the specialized heads decreases reachability IoU by 0.18 and increases path MAE by 5.1).

4. Inference–Control Representation Decoupling

A central empirical observation of QDINs in deterministic (e.g., grid-world) environments is the pronounced separation between accurate world inference and optimal policy learning:

  • Training solely on inference (no control loss, $\alpha_\mathrm{control} = 0$) yields peak reachability-mask accuracy (IoU $\approx 0.99$) but poor navigation returns ($\approx 0.31$).
  • Control-only training ($\alpha_q = 0$) achieves a high navigation return ($\approx 0.89$) but low reachability IoU ($\approx 0.72$).
  • Mixed-objective QDINs nearly recover both (IoU $= 0.97$, return $= 0.82$); the three regimes are summarized in the table and configuration sketch below.
| Training Mode | Reach IoU | Path MAE | Comp Acc | Policy Acc | Return |
| --- | --- | --- | --- | --- | --- |
| Control-Only | 0.72 ± 0.03 | 8.4 ± 0.5 | 0.74 ± 0.02 | 0.81 ± 0.02 | 0.89 ± 0.03 |
| Query-Only | 0.99 ± 0.01 | 1.2 ± 0.1 | 0.92 ± 0.01 | 0.43 ± 0.04 | 0.31 ± 0.05 |
| Mixed (Ours) | 0.97 ± 0.01 | 2.1 ± 0.2 | 0.88 ± 0.02 | 0.76 ± 0.02 | 0.82 ± 0.03 |
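For concreteness, the three regimes differ only in which loss coefficients are active. A hedged configuration sketch, reusing the MixedObjective class from Section 2 (all numeric values illustrative):

```python
# Per-head weights; shared values are purely illustrative.
q_alphas = {"reachability": 1.0, "path": 1.0, "comparison": 1.0, "policy": 1.0}

control_only = MixedObjective(alphas={k: 0.0 for k in q_alphas},
                              alpha_control=1.0, lam=0.0)  # alpha_q = 0
query_only = MixedObjective(alphas=q_alphas,
                            alpha_control=0.0, lam=0.1)    # alpha_control = 0
mixed = MixedObjective(alphas=q_alphas,
                       alpha_control=1.0, lam=0.1)         # both objectives
```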

These results (see (Zakershahrak, 11 Nov 2025), Table 1) indicate that the representations sufficient for exact world-structure inference are not necessarily those that support high reward. This decoupling implies QDINs can serve as accurate knowledge bases irrespective of their control proficiency.

5. Empirical Performance and Comparative Evaluation

QDINs, evaluated across ablations, compositional generalization, inference efficiency, and calibration, show superior performance versus unified or post-hoc extraction baselines:

  • Ablation: Removing specialized heads or cross-attention causes significant drops in reachability IoU, path accuracy, and return.
  • Generalization: QDINs achieve 73% zero-shot accuracy on composite queries, compared to 41% for monolithic architectures.
  • Efficiency: QDINs answer reachability in 5–10 ms at 0.97 IoU; A* search requires 100–200 ms for exact solutions.
  • Calibration: temperature scaling yields ECE $= 0.031$; selective answering attains 95% accuracy at 80% coverage (a sketch follows below).
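As one concrete reading of the calibration protocol, here is a minimal sketch of temperature scaling followed by selective answering; the temperature and confidence threshold are assumptions to be tuned on held-out queries:

```python
import torch

def selective_answer(logits: torch.Tensor, temperature: float = 1.5,
                     conf_threshold: float = 0.6):
    """Calibrate with temperature scaling, then abstain whenever confidence
    falls below a threshold chosen for the desired accuracy/coverage."""
    probs = torch.softmax(logits / temperature, dim=-1)
    conf, pred = probs.max(dim=-1)
    answered = conf >= conf_threshold  # False entries are abstentions
    return pred, conf, answered
```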

This suggests specialized, query-conditioned designs are more robust and efficient than sharing the same representation for all query types or extracting answers from monolithic control policies.

6. Interpretability, Verification, and Modularity

QDINs provide directly interpretable outputs for each query type:

  • Reachability masks offer explicit spatial visualization of possible agent locations within a horizon.
  • Path outputs enumerate step-wise plans as human-understandable trajectories.
  • Comparisons yield meaningful probabilistic preferences between alternative goals.
  • Standard policies retain RL compatibility.

The modularity of QDINs supports formal safety verification (e.g., checking that unsafe states do not appear in predicted reachable sets, as sketched below) and interactive human–AI teaming, for example a user asking whether an agent can reach a location within a specified budget and instantly seeing either the planned path or the coverage mask. Extensions to temporal-logic or probabilistic query types require only adding new neural modules, without reengineering the model.
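A minimal sketch of the safety check just described, assuming a thresholded reachability mask and a boolean map of forbidden cells:

```python
import torch

def verify_no_unsafe_reachable(reach_mask: torch.Tensor,
                               unsafe: torch.Tensor,
                               threshold: float = 0.5) -> bool:
    """reach_mask: (H, W) sigmoid outputs from the reachability head;
    unsafe: (H, W) boolean map of forbidden cells."""
    reachable = reach_mask >= threshold
    # Safe iff no unsafe cell is flagged reachable within the horizon.
    return not bool((reachable & unsafe).any())
```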

7. Relation to Prior Inference Frameworks

QDINs generalize and unify previous approaches to deterministic and probabilistic inference:

  • The structure mirrors the Query DAG (Q-DAG) approach for Bayesian networks, which compiles possible inferences into a single optimized data structure, enabling fast evaluation and update (Darwiche et al., 2013).
  • In undirected graphical models, unrolled inference networks (e.g., QT-NN) answer arbitrary subsetting queries over observed variables, sidestepping intractable partition functions and amortizing inference via deterministic mappings (Lazaro-Gredilla et al., 2019).
  • From an operator-theoretic perspective, QDINs embody the paradigm of learning a conditional expectation operator in a finite basis: encoding arbitrary queries as coefficients and producing answers by a single forward pass and inner product, as characterized in Neural Conditional Probability (Kostic et al., 1 Jul 2024).

This modularity, combined with direct end-to-end training for diverse queries, positions QDINs as a general substrate for compositional, fast, and interpretable conditional inference, with applications ranging from explainable reinforcement learning to probabilistic reasoning and uncertainty quantification.
