STRIVE Protocol Overview

Updated 27 December 2025
  • STRIVE is a collection of four distinct protocols, each defined by structured methodologies for claim verification, navigation, question quality assessment, and WLAN MAC protocol design.
  • Each protocol employs domain-specific techniques—such as explicit multi-step reasoning, graph-based navigation, iterative LLM feedback, and enhanced RTS/CTS mechanisms—to achieve robust and verifiable results.
  • Empirical evaluations demonstrate that STRIVE protocols deliver significant performance gains, improved fairness, and enhanced reliability across varied applications.

STRIVE refers to four independent research protocols, each with distinct methodologies and targeted domains. This entry provides a comprehensive overview of all major STRIVE protocols in the academic literature, covering (1) structured self-improvement for claim verification, (2) multi-layer VLM-guided representation for navigation, (3) iterative refinement in question quality estimation, and (4) FD operation in WLAN MAC protocols. Each protocol defines STRIVE as an acronym for a domain-specific framework with strong empirical justification.

1. STRIVE: Structured Reasoning for Self-Improvement in Claim Verification

STRIVE (“Structured Reasoning for Self-Improved Verification”) addresses claim verification—the task of determining whether a claim c is Supported or Refuted given an evidence set E = \{e_1, \ldots, e_n\}—by generating explicit, auditable multi-step reasoning chains. Unlike unstructured chain-of-thought approaches, STRIVE imposes formal structure through three sequential modules: Claim Decomposition (CD), Entity Analysis (EA), and Evidence Grounding Verification (EG). The self-improvement protocol selectively fine-tunes an LLM on its own high-quality, structurally valid chains to prevent the propagation of erroneous reasoning, a major failure mode in naïve self-improvement (Gong et al., 17 Feb 2025).

Framework Overview:

  • Initial Model and Warm-up: Begin with a base Llama-3-8B-Instruct model M. Warm up M by LoRA fine-tuning on a seed set D_h of H = 10 human-annotated examples, each with gold-structured reasoning chains.
  • Chain Generation: Use M^* (post-warm-up) to generate candidate reasoning chains r_i on each of N = 600 training examples.
  • Selection: Accept only chains meeting both (a) correct label (rule J(r): any sub-claim Status = Refuted ⇒ Refuted; else Supported) and (b) strict structured format constraints (f(r)). For failures, a hint-based prompt offers a second chance at correction.
  • Self-Improvement: Final model M_{st} is produced by additional LoRA fine-tuning on D_h \cup D_{st} (the combined high-quality chains).

Structured Modules:

  • Claim Decomposition: Explicitly partition c into sub-claims C_1, \ldots, C_k, with each evaluated and resolved separately. Block syntax: Cj: ... Status: Supported/Refuted.
  • Entity Analysis: For each sub-claim, resolve ambiguous entities to precise Wikipedia entries, then verify these resolutions with evidence citations.
  • Evidence Grounding: Each sub-claim must be tied to a specific evidence index, ensuring local justification.
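
The label rule J(r) over a parsed chain can be sketched in Python (the block syntax follows the description above; the parser and helper names are illustrative, not the authors' code):

```python
import re

def parse_chain(chain_text):
    """Extract (sub-claim, status) pairs from a structured reasoning chain.

    Assumes the block syntax described above: lines of the form
    'C1: ... Status: Supported' or 'C2: ... Status: Refuted'.
    """
    pattern = re.compile(r"C\d+:\s*(.*?)\s*Status:\s*(Supported|Refuted)")
    return pattern.findall(chain_text)

def judge(chain_text):
    """Label rule J(r): any Refuted sub-claim => Refuted; else Supported."""
    statuses = [status for _, status in parse_chain(chain_text)]
    return "Refuted" if "Refuted" in statuses else "Supported"

chain = (
    "C1: The film was released in 2010. Status: Supported\n"
    "C2: It won the Palme d'Or. Status: Refuted\n"
)
print(judge(chain))  # -> Refuted
```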

Algorithmic Protocol (simplified):

  1. M^* \leftarrow fine-tune(M, D_h)
  2. For i = 1 \ldots N: \hat{r}_i \leftarrow M^*(T(c_i, E_i)), \quad \hat{y}_i \leftarrow J(\hat{r}_i)
  3. D_1 \leftarrow correctly labeled examples
  4. D_2 \leftarrow hint-based regeneration for failures
  5. D_{st} \leftarrow (D_1 \cup D_2) filtered by f(r)
  6. M_{st} \leftarrow fine-tune(M, D_h \cup D_{st})
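
The six steps above can be sketched as a single loop (all callables are stand-ins for the paper's components M, J, f, and the LoRA fine-tuning step):

```python
def self_improve(generate, regenerate_with_hint, judge, format_ok,
                 fine_tune, base_model, seed_set, train_examples):
    """One round of STRIVE-style selective self-improvement (sketch).

    generate / regenerate_with_hint: produce a reasoning chain for an example.
    judge: maps a chain to a predicted label; format_ok: structural check f(r).
    """
    warm = fine_tune(base_model, seed_set)            # step 1: warm-up on D_h
    accepted = []
    for example in train_examples:                    # step 2: generate chains
        chain = generate(warm, example)
        if judge(chain) != example["label"]:          # steps 3-4: keep correct,
            chain = regenerate_with_hint(warm, example)  # retry failures with a hint
        if judge(chain) == example["label"] and format_ok(chain):
            accepted.append(chain)                    # step 5: filter by f(r)
    return fine_tune(base_model, seed_set + accepted)  # step 6: train M_st
```

Only chains that pass both the label check and the format check enter the final fine-tuning set, which is the mechanism that blocks erroneous reasoning from propagating.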

Results on HOVER:

  • Macro-F1: HOVER-2: 76.13 ± 0.84, HOVER-3: 70.50 ± 0.55, HOVER-4: 68.50 ± 1.27
  • Gains: +31.4% over base, +20.7% over Chain of Thought (STaR*)

Ablation: Removing Claim Decomposition yields maximal drop; Entity Analysis and Format Checking are also critical for performance (Gong et al., 17 Feb 2025).

2. STRIVE: Structured Representation Integrating VLM Reasoning for Efficient Object Navigation

STRIVE (“STructured Representation Integrating VLM Reasoning for Efficient Object Navigation”) defines a multi-layer embodied agent protocol for sample-efficient object navigation with VLM integration (Zhu et al., 10 May 2025). The agent’s internal world model at time tt is an attributed spatial graph:

R_t = (V, E), \quad V = V^{room} \cup V^{vp} \cup V^{obj}

where V^{vp} is the set of key viewpoints, V^{obj} is the set of 3D object instances, V^{room} is the set of topologically inferred rooms, and E encodes spatial or semantic relationships.
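
A minimal sketch of the three-layer attributed graph (the node fields and edge relations are assumptions for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    layer: str                # "room", "vp" (viewpoint), or "obj"
    attrs: dict = field(default_factory=dict)

@dataclass
class SpatialGraph:
    """Attributed spatial graph R_t = (V, E) with three node layers (sketch)."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)   # (src_id, dst_id, relation)

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def layer(self, name):
        return [n for n in self.nodes.values() if n.layer == name]

g = SpatialGraph()
g.add_node(Node("room_0", "room"))
g.add_node(Node("vp_0", "vp", {"pos": (1.0, 2.0)}))
g.add_node(Node("obj_0", "obj", {"cls": "chair", "conf": 0.9}))
g.edges.append(("vp_0", "room_0", "in_room"))
print(len(g.layer("vp")))  # -> 1
```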

Protocol Components:

  • Node Construction:
    • Viewpoint nodes are created by local region coverage, using a threshold \zeta_{cover} = 1.0 m with polygonal ray casting.
    • Object nodes are extracted via 2D mask segmentation (Grounding DINO+SAM) lifted to 3D, plus class and confidence.
    • Room nodes arise via 2D projection, wall detection, and seeded watershed segmentation.
  • Navigation Policy: Two stages:
    • \pi_{high}: VLM-guided high-level planner, using a JSON graph snapshot and explicit penalization of previously traversed rooms via

    \tilde{d}(r_{curr} \to r_k) = d_{geo}(r_{curr} \to r_k) \cdot (1 + \alpha \cdot |V^{vp}_{explored}(r_k)| / H)

    (\alpha = 0.1, H the remaining horizon).
    • \pi_{low}: Frontier-based exploration within a room, with a VLM early-stopping criterion based on internal graph state.
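
The penalized-distance formula used by the high-level planner can be computed directly (parameter names are illustrative):

```python
def penalized_distance(d_geo, explored_vps_in_room, horizon, alpha=0.1):
    """Penalize rooms that were already partially explored (sketch of the
    formula above).

    d_geo: geodesic distance from the current room to candidate room r_k.
    explored_vps_in_room: |V^vp_explored(r_k)|; horizon: remaining budget H.
    """
    return d_geo * (1 + alpha * explored_vps_in_room / horizon)

# An unexplored room keeps its geodesic distance; an explored one is inflated.
print(penalized_distance(5.0, 0, 20))   # -> 5.0
print(penalized_distance(5.0, 10, 20))  # -> 5.25
```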

Use of VLM:

  • Query is triggered only at key decision points (post-frontier exhaustion, ambiguous objects, or room selection), with a fixed prompt that compels chain-of-thought reasoning and outputs in explicit format (“steps”, “final_answer”, “reason”).
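
A sketch of validating that a VLM reply carries the required output fields (the JSON transport and error handling are assumptions; only the field names come from the description above):

```python
import json

REQUIRED_KEYS = {"steps", "final_answer", "reason"}

def parse_vlm_reply(raw):
    """Validate that a VLM reply follows the explicit output format (sketch)."""
    reply = json.loads(raw)
    missing = REQUIRED_KEYS - reply.keys()
    if missing:
        raise ValueError(f"malformed VLM reply, missing: {sorted(missing)}")
    return reply

reply = parse_vlm_reply(
    '{"steps": ["check kitchen"], "final_answer": "room_2", '
    '"reason": "chairs imply a kitchen"}'
)
print(reply["final_answer"])  # -> room_2
```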

Performance and Metrics:

  • On HM3D, STRIVE achieves SR = 79.6% (+7.1% over CogNav) and SPL = 38.7 (+12.5%). On RoboTHOR and MP3D, state-of-the-art success and efficiency metrics are reported.

  • Ablation studies confirm that each graph layer (object, room) and the VLM early-stop gate are essential for maximal navigation efficiency (Zhu et al., 10 May 2025).

3. STRIVE: Iterative Multi-LLM Refinement for Question Quality Estimation

STRIVE (“Structured Thinking and Refinement with multiLLMs for Improving Verified Question Estimation”) is an automated evaluation protocol for grading educational questions along five axes: grammaticality, appropriateness, relevance, novelty, and complexity (Deroy et al., 8 Apr 2025). The method uses dual “Think & Improve” modules (LLM instances TM₁, TM₂) to iteratively generate diverse candidate evaluations, judge best strength-weakness pairs, and refine scores until consensus is reached.

Algorithmic Steps:

  • Initialize with N = 10 candidate strengths and weaknesses, scored by TM₁.

  • In alternating rounds, TM₂ and TM₁ each:

    • Generate 10 new strength-flaw variants (temperature diversity ensured).
    • Judge and select the best.
    • Produce metric vector v^t = [s_{Gram}^t, s_{App}^t, s_{Rel}^t, s_{Nov}^t, s_{Com}^t].
  • Convergence is declared when score vectors match for two successive rounds (v_{k-1} = v_k = v_{k+1}).
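
The alternating refinement loop and the strict-equality stopping rule can be sketched as follows (the evaluator callables stand in for TM₁/TM₂; the round budget is an assumed safeguard):

```python
def refine_until_consensus(evaluators, initial_scores, max_rounds=20):
    """Iterate Think & Improve modules until score vectors repeat (sketch).

    evaluators: callables alternated each round (stand-ins for TM1/TM2),
    each mapping the previous score vector to a new one.
    Convergence: three successive identical vectors (v_{k-1} = v_k = v_{k+1}).
    """
    history = [initial_scores]
    for t in range(max_rounds):
        scores = evaluators[t % len(evaluators)](history[-1])
        history.append(scores)
        if len(history) >= 3 and history[-1] == history[-2] == history[-3]:
            return history[-1]
    return history[-1]   # no consensus within the round budget
```

With strict equality as the stopping criterion (ε_m = 0), the loop only halts once both modules agree on every one of the five scores.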

Scoring:

  • LLMs output scalar scores s_m \in [1, 5] per metric, with strict equality for stopping (\epsilon_m = 0).
  • Prompts enforce metric definitions as system messages and clarify the evaluation or candidate generation task.

Empirical Validation:

  • On EduProbe (1,000 questions, human-rated baseline): STRIVE with GPT-4 achieves considerably higher Pearson’s r with human scores, especially in relevance (r_{Rel} = 0.42 → 0.61) and appropriateness (r_{App} = 0.41 → 0.62), exceeding single-pass LLM evaluation (Deroy et al., 8 Apr 2025).
  • Exact-score matches increase for all metrics; qualitative analysis shows STRIVE corrects for ambiguity and better aligns with pedagogical intent.

4. STRIVE: MAC Protocol for Simultaneous Transmit and Receive (STR) in WLANs

STRIVE (“Simultaneous Transmit and Receive Operation in Next Generation IEEE 802.11 WLANs: A MAC Protocol Design Approach”) is a medium access control protocol designed to enable full-duplex (FD) operation—both bi-directional (BFD) and uni-directional (UFD)—in IEEE 802.11ax networks (Aijaz et al., 2017). STRIVE’s objective is the integration of FD mode with strict backward compatibility and minimal protocol overhead.

Key Protocol Phases:

  • FD Capability Discovery: Utilizes reserved bits in the standard 802.11 Capability Information field in beacon and association-request frames. No additional IE required; legacy HD stations ignore these bits.
  • Handshake Mechanism: Enhanced RTS/CTS. The AP sets a reserved bit in CTS to signal FD mode (CTS-FD). Duration fields are precisely computed:

D_0 = 3 \cdot SIFS + T_{CTS} + T_{DATA} + T_{ACK}

D_1 = D_0 - (T_{CTS} + SIFS)
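
Given per-frame timings, the two duration fields follow directly from the formulas above (the example timing values are illustrative, not from the paper):

```python
def cts_fd_durations(sifs, t_cts, t_data, t_ack):
    """Duration fields for the enhanced RTS/CTS handshake, per the formulas
    above (all times in microseconds)."""
    d0 = 3 * sifs + t_cts + t_data + t_ack   # carried in the RTS frame
    d1 = d0 - (t_cts + sifs)                 # carried in the CTS-FD frame
    return d0, d1

# Illustrative 802.11-style timing values (assumed):
d0, d1 = cts_fd_durations(sifs=16, t_cts=44, t_data=1500, t_ack=44)
print(d0, d1)  # -> 1636 1576
```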

  • Node Selection for UFD: Interference graphs are constructed from neighborhood tables, with the AP selecting UFD partners satisfying adjacency (A(1,2) = 0) and SINR thresholds.
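
A sketch of UFD partner selection from an interference graph (the adjacency and SINR constraints follow the description above; the best-SINR tie-break is an assumption):

```python
def select_ufd_partner(candidates, adjacency, src, sinr, sinr_min):
    """Pick a UFD partner: non-adjacent to the FD source (A[src][k] == 0)
    and above the SINR threshold (sketch)."""
    eligible = [k for k in candidates
                if adjacency[src][k] == 0 and sinr[k] >= sinr_min]
    return max(eligible, key=lambda k: sinr[k]) if eligible else None

adjacency = {1: {2: 0, 3: 1}}      # station 3 interferes with station 1
sinr = {2: 18.0, 3: 25.0}
print(select_ufd_partner([2, 3], adjacency, src=1, sinr=sinr, sinr_min=10.0))  # -> 2
```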

Contention Unfairness Mitigation:

  • CTS-FD-aware overhearing disables EIFS start on corrupted packets during FD periods.
  • FD Transmission Indicator (FDTI) is broadcast immediately post-BFD/UFD, signaling STAs to reset EIFS and restore fairness.

Analytical Metrics:

  • Contention Unfairness Index (CUI), based on Jain’s index, measures fairness restoration.
  • Simulations demonstrate STRIVE achieves throughput gains up to 1.9× and restores CUI to near 1 in high-density FD deployments (Aijaz et al., 2017).
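
Jain's index, on which the CUI is based, is simple to compute (a plain sketch, not the paper's exact CUI definition):

```python
def jains_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2). Equals 1 when all
    stations receive equal throughput; approaches 1/n under maximal unfairness."""
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(x * x for x in throughputs))

print(jains_index([10, 10, 10, 10]))        # -> 1.0 (perfectly fair)
print(round(jains_index([30, 5, 5]), 3))    # one station dominating lowers the index
```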

5. Comparative Summary Table

Domain (STRIVE Protocol) | Core Mechanism | Unique Feature(s)
Claim Verification (Gong et al., 17 Feb 2025) | Structured self-improvement | Claim decomposition, entity analysis, audit trail
Navigation (Zhu et al., 10 May 2025) | Multi-layer VLM+graph policy | Viewpoint/object/room representation layers
Question Quality (Deroy et al., 8 Apr 2025) | Multi-LLM iterative feedback | Dual-LLM feedback loop, 5-score convergence
WLAN MAC (Aijaz et al., 2017) | Enhanced 802.11 STR protocol | FD/HD coexistence, CUI fairness, FDTI frame

6. Limitations and Future Directions

Each STRIVE protocol acknowledges constraints intrinsic to its domain:

  • Claim Verification: Dependency on a diverse annotated seed set and only single-round self-improvement explored; multi-round or learned verifiers are proposed for future work (Gong et al., 17 Feb 2025).
  • Navigation: Scaling is currently limited by representation resolution and hardware; extension to 70B+ LLMs or additional evidence types (tables/images) is a future direction (Zhu et al., 10 May 2025).
  • Question Quality: Dependent on prompt design and metric definitions; further generalization to more diverse question types and domains is open (Deroy et al., 8 Apr 2025).
  • WLAN MAC: Protocol is evaluated in system-level simulation; real-world hardware effects and large-scale deployments remain to be studied (Aijaz et al., 2017).

7. Significance and Impact

STRIVE protocols define research frontiers in verifiable reasoning, embodied navigation, evaluation automation, and network protocol design, each introducing structured, supervision-rich approaches to domains where prior art was limited by unstructured heuristics or protocol rigidity. Their published evaluations report consistent quantitative gains, especially where structure and supervision are leveraged to minimize critical error modes, enable compositional exploration, or enforce system-wide guarantees. Ongoing research continues to explore scaling and wider applicability across tasks and modalities.
