
Dynamic Outline-Guided Agent (DOGA)

Updated 22 November 2025
  • The DOGA framework introduces intent-conditioned micro-planning via outline retrieval and dynamic prompt assembly to guide LLM outputs during persuasive dialogues.
  • It leverages a finite-state machine for intent classification, ensuring coherent dialogue transitions and strategic alignment in a telemarketing context.
  • The approach integrates a pre-verified script library and mathematical formulations to minimize factual hallucination and bolster response fidelity.

The Dynamic Outline-Guided Agent (DOGA) is a modular inference-stage framework designed to inject turn-level strategic structure into LLMs during goal-driven, multi-turn persuasive dialogue. Developed within the AI-Salesman architecture for telemarketing, DOGA introduces explicit, intent-conditioned micro-planning at each turn by dynamically retrieving and personalizing vetted outline steps from an offline script library, thereby addressing the strategic brittleness and factual hallucination endemic to generic prompt-based deployments of LLMs (Zhang et al., 15 Nov 2025).

1. Inference Architecture and Core Operational Cycle

DOGA operates exclusively during inference, orchestrating each LLM response through three tightly integrated sub-tasks: intent classification, outline retrieval/personalization, and dynamic prompt assembly. At dialogue turn $t$, the agent executes:

  1. Intent Classification: A lightweight intent classifier (fine-tuned Qwen2.5-7B) leverages the current dialogue context, $H_{t-1}$, and the user’s latest utterance $U_t$ to assign a sales intent label $I_t$ (e.g., Business_Analysis, Objection_Handling). A finite-state machine constrains allowable transitions, ensuring strategic coherence.
  2. Outline Retrieval & Personalization: For the inferred $I_t$, the agent retrieves one or more high-performing templates from the offline script library $\mathcal{L}(I_t)$. These templates, distilled from historical conversions, contain parameterized placeholders (e.g., $\{\text{user\_onboard\_days}\}$), populated on-the-fly using the static user profile $M$.
  3. Dynamic Prompt Assembly: The populated outline (the dynamic outline) is concatenated with a static system prompt (encoding agent persona, business rules, and immutably defined constraints) and the full dialogue history, yielding the constructed prompt $P_t$, which conditions the LLM response. The inference cycle thus implements:

$$A_t^* = \arg\max_{A} P\big(A \mid P_{\text{static}} \oplus H_{t-1} \oplus D(I_t, M)\big)$$

where $D(I_t, M)$ denotes the personalized outline.
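
To make the assembly step concrete, a minimal sketch of the prompt construction is given below. The helper name build_prompt, the (role, utterance) history representation, and the exact section layout are illustrative assumptions, not the paper's implementation.

# Minimal sketch of dynamic prompt assembly (sub-task 3); the prompt layout
# and helper names are illustrative assumptions.
def build_prompt(p_static: str, history: list[tuple[str, str]], outline: list[str]) -> str:
    history_text = "\n".join(f"{role}: {utt}" for role, utt in history)
    outline_text = "\n".join(f"{i + 1}. {step}" for i, step in enumerate(outline))
    return (
        f"{p_static}\n\n"                            # persona, business rules, constraints
        f"Dialogue so far:\n{history_text}\n\n"      # H_{t-1}
        f"Outline for this turn:\n{outline_text}\n"  # D(I_t, M), the personalized outline
        "Agent:"
    )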

2. Algorithmic Process: Outline Generation and Turn Evolution

The turn-level DOGA algorithm formalizes the control flow for outline-guided inference. Given dialogue history $H_{t-1}$, the new user utterance $U_t$, and user profile $M$, the following loop occurs at each agent turn:

function DOGA_generate(H_{t-1}, U_t, M):
    # 1) Intent classification (constrained by FSM)
    p = IntentClassifier(H_{t-1}, U_t)
    I_t = argmax_i p[i]
    # 2) Template retrieval & ranking
    candidates = 𝓛(I_t)
    for each T in candidates:
        score_T = cosine(Embed(T), Embed(U_t))
    T_star = argmax_{T ∈ candidates} score_T
    # 3) Personalize placeholders
    O_t = FillPlaceholders(T_star, M)
    # 4) Assemble prompt
    P_t = P_static ⊕ H_{t-1} ⊕ O_t
    # 5) Generate agent response
    A_t = LLM_generate(P_t)
    return A_t, I_t
At turn $t+1$, the updated history $H_t = H_{t-1} \oplus U_t \oplus A_t$ is supplied. This process ensures each agent action is tightly tethered to an explicit, contextualized micro-plan.
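
A minimal driver loop for this turn evolution might look like the following sketch; it assumes a Python implementation of DOGA_generate as above, a hypothetical get_user_utterance() source of incoming messages, and a hypothetical terminal Closing intent.

# Sketch of the multi-turn loop; DOGA_generate is the routine above,
# get_user_utterance() and the "Closing" terminal intent are assumptions.
def run_dialogue(M, max_turns=20):
    H = []                                       # H_0: empty dialogue history
    for t in range(1, max_turns + 1):
        U_t = get_user_utterance()               # user's latest utterance U_t
        A_t, I_t = DOGA_generate(H, U_t, M)      # outline-guided agent response
        H = H + [("user", U_t), ("agent", A_t)]  # H_t = H_{t-1} ⊕ U_t ⊕ A_t
        if I_t == "Closing":                     # hypothetical terminal intent
            break
    return H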

3. Offline Script Library Construction and Integration

DOGA’s outline provisioning depends on a pre-constructed library of scripts. This library is generated offline in a sequence of steps:

1. Data Collection & Intent Annotation:

A dataset of high-performing tele-sales dialogues is annotated (via GPT-4) with top-level intent tags and explicitly invoked user facts at each turn.

2. Script Extraction:

For each (turn, intent) tuple, the corresponding agent utterance is generalized into a succinct, bullet-point template via GPT-4 re-writing.

3. Clustering & Summarization:

Templates for each intent are embedded (using Qwen3-Embedding), clustered according to greedy cosine similarity (threshold > 0.8), and each cluster is summarized by GPT-4 into a canonical outline template.

The finalized library $\mathcal{L} = \{\, i \mapsto [T^i_1, \dots, T^i_{n_i}] \,\}_{i \in \mathcal{I}}$ enables targeted, fidelity-promoting retrieval and prompt construction. At runtime, only $\mathcal{L}(I_t)$ is accessed to preserve computational efficiency.

| Phase | Method/Tool Used | Output |
|-------|------------------|--------|
| Data Collection & Annotation | GPT-4 | Annotated corpus with intents |
| Script Extraction | GPT-4 | Bullet-point outline templates |
| Clustering & Summarization | Qwen3-Embedding + GPT-4 | Canonical outline templates by intent |
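
As a rough illustration of the clustering step, a greedy cosine-similarity grouping over template embeddings could look like the sketch below; embed() stands in for Qwen3-Embedding, and the greedy policy (assign each template to the first cluster whose centroid clears the threshold) is an assumption, since only the 0.8 similarity threshold is specified.

import numpy as np

# Illustrative greedy clustering of outline templates by cosine similarity.
# embed() is a stand-in for Qwen3-Embedding; the specific greedy policy is assumed.
def greedy_cluster(templates, embed, threshold=0.8):
    clusters = []  # each cluster: {"members": [...], "centroid": unit vector}
    for text in templates:
        v = embed(text)
        v = v / np.linalg.norm(v)
        for c in clusters:
            if float(np.dot(v, c["centroid"])) > threshold:
                c["members"].append(text)
                # approximate running mean of member vectors, renormalized
                centroid = c["centroid"] * (len(c["members"]) - 1) + v
                c["centroid"] = centroid / np.linalg.norm(centroid)
                break
        else:
            clusters.append({"members": [text], "centroid": v})
    return clusters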

4. Mathematical Formulation of Template Selection and Prompt Utility

DOGA formalizes outline retrieval and agent action as optimization problems for both template selection and response generation:

Template Ranking:

When multiple templates for intent $I_t$ exist, selection is based on semantic similarity between candidate $T$ and $U_t$:

$$\mathrm{Score}_{\text{sel}}(T; U_t) = \cos\big(\mathrm{Embed}(T), \mathrm{Embed}(U_t)\big)$$

The selected template is

$$T^*_t = \arg\max_{T \in \mathcal{L}(I_t)} \mathrm{Score}_{\text{sel}}(T; U_t)$$

Turn-Level Utility Function:

The overall agent action at turn $t$ maximizes the conditional probability of the completion under the dynamic system prompt:

$$A^*_t = \arg\max_{A} P\big(A \mid P_{\text{static}} \oplus H_{t-1} \oplus O_t\big)$$

The injected outline $O_t$ acts as a soft utility-shaping function, steering LLM generation toward alignment with empirically validated strategies.
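
A small sketch of this template-ranking step is shown below; embed() again stands in for the embedding model, and the dictionary layout of the library (intent label mapped to a list of template strings) is an assumption.

import numpy as np

# Illustrative template selection: rank the candidates for the inferred intent
# by cosine similarity to the user's latest utterance and keep the best one.
def select_template(library, intent, user_utterance, embed):
    candidates = library[intent]                        # 𝓛(I_t)
    u = embed(user_utterance)
    u = u / np.linalg.norm(u)
    def score(t):                                       # Score_sel(T; U_t)
        v = embed(t)
        return float(np.dot(u, v / np.linalg.norm(v)))
    return max(candidates, key=score)                   # T*_t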

5. Outline-to-Utterance Mapping: Concrete Example

Consider the following user message at turn 3:

"I’m worried I won’t see a return on this spend."

DOGA's processing sequence:

  • Intent classification: assigns Objection_Handling.
  • Retrieved & personalized outline $O_3$:

    1. Acknowledge the budget concern.
    2. Remind of “Flash Recharge Bonus” eligibility (< 30 days onboarded).
    3. Emphasize coupon value and valid use cases.
    4. Propose next action: “Shall we top up now?”

The LLM, conditioned on this explicit, numbered outline and domain constraints, produces an utterance in which each bullet structurally maps to a clause or sentence:

"I completely understand your caution around budget. Since you joined 15 days ago and have spent under \$10, you qualify for our Flash Recharge Bonus. If you add \$50 today, you’ll receive a \$10 coupon valid on keyword-bidding ads. Would you like me to walk you through that recharge now?"

This mapping illustrates DOGA’s decoupling of strategy selection (outline) from strategy execution (natural language realization), ensuring both task alignment and user adaptation.
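
The placeholder personalization behind such an outline reduces to template substitution from the static user profile; the sketch below uses an assumed {placeholder} syntax and hypothetical profile keys.

import re

# Illustrative FillPlaceholders: substitute {key} fields in a retrieved template
# with values from the static user profile M; unknown keys are left intact.
def fill_placeholders(template: str, profile: dict) -> str:
    def repl(match):
        key = match.group(1)
        return str(profile.get(key, match.group(0)))
    return re.sub(r"\{(\w+)\}", repl, template)

# Example with a hypothetical profile key:
# fill_placeholders("Remind of bonus eligibility ({user_onboard_days} days onboarded).",
#                   {"user_onboard_days": 15})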

6. Faithfulness, Robustness, and Error Prevention Mechanisms

DOGA incorporates multiple mechanisms to enforce factuality, strategic robustness, and resistance to hallucination:

  • Domain Constraint Injection:

The static system prompt $P_{\text{static}}$ encodes all promotional rules, eligibility criteria, pricing, and forbidden topics, prohibiting generation of unverified claims.

  • Intent-FSM Enforcement:

The intent classifier is governed by a finite-state machine that enforces allowed transitions (e.g., prohibiting jumps from Objection_Handling to Business_Analysis when incoherent), maintaining dialogue structure and progression (see the sketch after this list).

  • Pre-verified Script Library:

Only templates distilled, clustered, and manually sanity-checked from successful real-world dialogues are admissible; ungrounded strategy steps are precluded by construction.

  • Prompt Schema Control:

The dynamic prompt utilizes a rigid, bullet-point schema in presenting the outline to the LLM, implicitly biasing the model towards following the stepwise plan.
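
The FSM constraint on intent transitions can be sketched as a masking step over the classifier's scores; the transition table below is a made-up example, not the paper's actual state machine, and the intent names beyond Business_Analysis and Objection_Handling are assumptions.

# Illustrative FSM-constrained intent selection: mask classifier scores so that
# only transitions allowed from the previous intent can be chosen.
ALLOWED = {  # hypothetical transition table
    "Opening": {"Business_Analysis", "Objection_Handling"},
    "Business_Analysis": {"Objection_Handling", "Proposal", "Closing"},
    "Objection_Handling": {"Objection_Handling", "Proposal", "Closing"},
    "Proposal": {"Objection_Handling", "Closing"},
}

def constrained_intent(scores: dict, prev_intent: str) -> str:
    allowed = ALLOWED.get(prev_intent, set(scores))
    feasible = {i: s for i, s in scores.items() if i in allowed} or scores
    return max(feasible, key=feasible.get)  # highest-scoring allowed intent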

Collectively, these mechanisms ensure that every LLM response is both strategically appropriate and factually correct within the given telemarketing domain (Zhang et al., 15 Nov 2025). This architecture provides a systematic, micro-planning-centric alternative to conventional few-shot prompting and post-hoc verification in controlled, goal-oriented conversational AI.
