
SPS: Search, Optimization & Retrieval

Updated 28 December 2025
  • Similar Prompts Searching (SPS) is a framework that identifies semantically similar prompts using structured search, graph models, and statistical techniques.
  • Core methods include search-based approaches like beam search and random walk, as well as retrieval techniques utilizing token-level LSH and KL divergence minimization.
  • Empirical findings demonstrate that SPS improves model efficiency, inference speed, and task performance across diverse NLP and multimodal applications.

Similar Prompts Searching (SPS) refers to algorithmic and statistical approaches for identifying, evaluating, and leveraging prompts that are semantically or functionally similar within large language and multimodal model systems. SPS is a foundational component in prompt optimization, transfer, retrieval-augmentation, KVCache reuse, and creative prompt evolution. It encompasses both discrete and continuous methods, spanning text, visual, and architectural prompt modalities.

1. Formal Frameworks for SPS: Graphs, Search, and Similarity

SPS methodologies model the prompt space, denoted $\mathcal{P}$, as a structured domain supporting a spectrum of search and retrieval operations. In "Prompt Optimization as a State-Space Search Problem" (Taneja, 23 Nov 2025), the prompt space is constructed as a directed graph $G=(V,E)$ where each node $p\in \mathcal{P}$ encodes a prompt (represented by a PromptNode including the prompt string, parent, generating operator, and evaluation score). Edges correspond to transformation operators $O:\mathcal{P}\times \mathcal{I}\times \mathcal{D}_{\mathrm{train}}\rightarrow\mathcal{P}$ mapping one prompt to another via explicit operations such as shortening, adding demonstrations, or reordering content.
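A minimal sketch of this graph representation; the expand helper, the evaluate(prompt, data) signature, and the operator dictionary are illustrative assumptions, not the paper's API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PromptNode:
    """A node in the prompt search graph G = (V, E)."""
    prompt: str                      # the prompt string itself
    parent: Optional["PromptNode"]   # node this prompt was derived from
    operator: Optional[str]          # name of the operator that produced it
    score: float                     # evaluation score Eval(p, D)

def expand(node, operators, instruction, train_data, evaluate):
    """Apply each operator O: P x I x D_train -> P to produce child nodes."""
    children = []
    for name, op in operators.items():
        q = op(node.prompt, instruction, train_data)
        children.append(PromptNode(q, node, name, evaluate(q, train_data)))
    return children
```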

For retrieval paradigms such as SemShareKV (Zhao et al., 29 Sep 2025), the notion of similarity is operationalized via token-level and prompt-level locality-sensitive hashing (LSH) on semantically and positionally augmented embeddings—enabling efficient sublinear retrieval of semantically proximate prompts even under significant lexical or structural variation. Visual and cross-modal variants, such as DualCap (Li et al., 28 Oct 2025), extend SPS to dual retrieval (image-to-image and image-to-text) and feature fusion over token or patch-level embeddings.

2. Core Algorithms: Search, Optimization, and Retrieval

Search-based SPS:

Popular SPS techniques include random walk and beam search. Random walk applies randomly selected transformation operators iteratively, updating best-so-far according to a prompt evaluation heuristic:

```python
import random

# Random-walk prompt search: keep the best-scoring prompt seen so far.
best = current = seed_prompt
best_score = evaluate(best, D)
for t in range(N):
    m = random.choice(M)           # choose an operator in M uniformly
    current = m.apply(current, T)
    score = evaluate(current, D)
    if score > best_score:
        best, best_score = current, score
```
Beam search expands the top-$k$ nodes at each depth, systematically exploring transformation combinations and pruning by evaluation score, thereby maximizing the likelihood of locating functionally robust prompts. Formally, at depth $\ell$, the beam $B_\ell$ is updated as

$$B_{\ell+1} = \operatorname{top}_k\{\, O(p, i, T) \mid p \in B_\ell,\ O \in M \,\}$$

with scoring $h(p)=\mathrm{Eval}(p, D)$ (Taneja, 23 Nov 2025).
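A compact beam-search sketch consistent with this update rule, reusing the operator and evaluation interfaces assumed in the random-walk snippet (the k and depth defaults are illustrative):

```python
def beam_search(seed_prompt, operators, T, D, evaluate, k=4, depth=3):
    """Expand the top-k prompts at each depth, pruning by h(p) = Eval(p, D)."""
    beam = [(evaluate(seed_prompt, D), seed_prompt)]
    for _ in range(depth):
        candidates = []
        for _, p in beam:
            for op in operators:            # O in M
                child = op.apply(p, T)      # O(p, i, T)
                candidates.append((evaluate(child, D), child))
        # B_{l+1} = top_k of all expansions
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return max(beam, key=lambda c: c[0])[1]
```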

Retrieval-based SPS:

Token-level LSH, as formalized in SemShareKV, maps positionally encoded and normalized token embeddings $x\in\mathbb{R}^d$ through $L$ hash tables, each based on $k$ random projections and bucket width $w$. The hash function per table is

$$g(x)=\left(h_1(x),\dots,h_k(x)\right), \quad h_j(x)=\left\lfloor \frac{r_j\cdot x + b}{w}\right\rfloor$$

This fuzzy-matching infrastructure efficiently indexes and queries candidate tokens or prompt segments, yielding high recall and precision even for paraphrased or reordered prompts (Zhao et al., 29 Sep 2025).
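A minimal NumPy sketch of the per-table hash function; the L, k, w values mirror the settings reported in Section 4, while the embedding dimension and interface are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, L, w = 768, 10, 8, 0.6  # embedding dim (assumed) and LSH parameters

# One matrix of k random projections r_j and one offset vector b per table.
tables = [(rng.standard_normal((k, d)), rng.uniform(0, w, size=k)) for _ in range(L)]

def lsh_keys(x: np.ndarray) -> list[tuple[int, ...]]:
    """g(x) = (h_1(x), ..., h_k(x)) with h_j(x) = floor((r_j . x + b) / w)."""
    return [tuple(np.floor((R @ x + b) / w).astype(int)) for R, b in tables]
```

Tokens whose keys collide in any of the L tables become fuzzy-match candidates.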

Functional Similarity Search:

"Prompts have evil twins" (Melamed et al., 2023) frames SPS as discrete maximum likelihood estimation, seeking prompts p^\hat{p} that approximate the output distribution P(p)P(\cdot|p^*) of a reference prompt pp^* using greedy coordinate gradient (GCG) optimization over the KL divergence between induced distributions: dkl(pp)=KL[P(p)P(p)]d_{\rm kl}(p^*\Vert p)=\mathrm{KL}\left[P(\cdot|p^*)\Vert P(\cdot|p)\right]

Task-level Prompt Selection:

In Vision In-Context Learning (VICL), SPS reduces to identifying a prompt subset $\mathcal{P}^*$ minimizing aggregate task loss

$$\mathcal{P}^* = \arg\min_{\mathcal{P}\subseteq\mathcal{S}} \sum_{(x_q,y_q)\in\mathcal{D}} \mathcal{L}(f(\mathcal{P},x_q),y_q)$$

using top-$K$ or greedy search strategies optimized over validation data (Zhu et al., 15 Jan 2025).
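A greedy-selection sketch of this objective; the candidate pool, loss_fn signature, and budget parameter are assumed names:

```python
def greedy_prompt_selection(candidates, val_set, loss_fn, budget):
    """Grow P* one prompt at a time, adding whichever candidate most
    reduces the aggregate validation loss sum L(f(P, x_q), y_q)."""
    selected = []
    for _ in range(budget):
        scored = [
            (sum(loss_fn(selected + [p], x, y) for x, y in val_set), p)
            for p in candidates if p not in selected
        ]
        _, best = min(scored, key=lambda s: s[0])
        selected.append(best)
    return selected
```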

3. Transformation Operators and Mutation

Prompt optimization frameworks concretize SPS by defining operator sets $\mathcal{M}$ acting over prompt substrings; two of these operators are sketched in code after the table:

Operator Description
make_concise Shorten and clarify prompt text
add_examples Extend with few-shot input–output pairs
reorder Change segment order (e.g., swap 'Instruction'/'Format')
make_verbose Expand with additional guidance/detail
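As a hedged illustration, two of these operators might be realized as follows; llm_rewrite is a hypothetical helper wrapping a rewriting-LLM call:

```python
def make_concise(prompt, instruction, train_data):
    # Ask a rewriting LLM to shorten and clarify the prompt text.
    return llm_rewrite(f"Shorten and clarify this prompt:\n{prompt}")

def add_examples(prompt, instruction, train_data, n=2):
    # Append n few-shot input-output demonstrations from the training split.
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in train_data[:n])
    return f"{prompt}\n\nExamples:\n{demos}"

M = {"make_concise": make_concise, "add_examples": add_examples}
```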

Empirical analysis reveals that make_concise dominates successful search trajectories, with add_examples and reorder being contextually important; make_verbose is rarely selected, indicating that brevity benefits instruction execution (Taneja, 23 Nov 2025).

Visual SPS, as in DualCap (Li et al., 28 Oct 2025), employs retrieval and chunk-based keyword distillation (POS-chunked nouns, verbs, adjectives) as operators that inject explicit scene semantics into vision-LLMs.

SCAPE (Lim et al., 31 Jan 2024) treats architectural prompt genes as mutable attributes (style, site, color, lighting, shape, material), leveraging human-guided selection, GPT-4-driven mutation/crossover, and stochastic attribute re-sampling to explore the conceptual prompt space.

4. Evaluation Metrics and Empirical Findings

Prompt candidates are assessed using the following metrics; a minimal scoring sketch follows the list:

  • String-match accuracy: $s_{\mathrm{str}}(p,x,y) = \mathbf{1}[f_p(x)=y]$
  • Critic LM evaluation: $s_{\mathrm{crit}}(p,x,y)=\mathbf{1}[\mathcal{C}(f_p(x),y)=\mathrm{true}]$, where $\mathcal{C}$ is a large LM judge.
  • Task loss: e.g., mIOU for segmentation/detection, MSE for regression tasks.
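A minimal sketch of the first two metrics; the critic is a hypothetical callable returning a "true"/"false" string:

```python
def s_str(model_output: str, y: str) -> int:
    """String-match accuracy: 1 iff the prompted model's output equals y."""
    return int(model_output.strip() == y.strip())

def s_crit(critic, model_output: str, y: str) -> int:
    """Critic-LM evaluation: 1 iff a judge model deems the output correct."""
    verdict = critic(f"Reference: {y}\nAnswer: {model_output}\nCorrect? true/false")
    return int(verdict.strip().lower() == "true")
```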

For retrieval methods, match precision/recall is measured relative to brute-force nearest-neighbor search. Token-level LSH yields $>92\%$ recall with $L=8$, $k=10$, $w=0.6$, and manageable index sizes (Zhao et al., 29 Sep 2025).

Prompt search impact:

In NLP prompt search (Taneja, 23 Nov 2025), beam search lifts development accuracy on reasoning from 0.40 → 0.80, but test-set improvements are more modest (0.20 → 0.50), indicating path-specific overfitting (see the table below).

Task (split)       Seed   One-Shot   Random Walk   Beam Search
reasoning (dev)    0.40   0.20       0.60          0.80
reasoning (test)   0.20   0.30       0.50          0.50

Prompt operator frequency in beam-best paths:

Operator       Frequency
make_concise   4
add_examples   2
reorder        2
make_verbose   0

SCAPE yields +67% novelty over basic DALL-E (Lim et al., 31 Jan 2024). In image captioning, DualCap’s SPS pipeline boosts CIDEr from 119.7 → 123.6 and SPICE from 21.3 → 22.0 (Li et al., 28 Oct 2025). SemShareKV achieves up to $6.25\times$ LLM inference speedup with 42% lower GPU memory at comparable output fidelity (Zhao et al., 29 Sep 2025). SPT achieves up to 90% improvement in response diversity (DIST-2) in dialog generation (Huang et al., 26 Jun 2024).

5. Architectural and Retrieval Design Variants

Dense vs. Sparse Retrieval:

SPT (Huang et al., 26 Jun 2024) implements a trainable dense retriever with context-prompt contrastive learning, mapping queries to the most relevant soft prompt for each conversational turn, using cosine similarity and softmax-normalized selection. Context diversity is enforced using contrastive regularization, ensuring prompt pool coverage and non-collapse.
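A sketch of the selection step under these definitions; the temperature tau and tensor shapes are assumptions, and the contrastive training loop is omitted:

```python
import torch.nn.functional as F

def select_soft_prompt(query_emb, prompt_keys, tau=0.1):
    """Cosine-similarity retrieval over a soft-prompt pool.

    query_emb: (d,) context encoding; prompt_keys: (pool_size, d).
    Returns the argmax prompt index and the softmax selection weights.
    """
    sims = F.cosine_similarity(query_emb.unsqueeze(0), prompt_keys, dim=-1)
    weights = F.softmax(sims / tau, dim=-1)
    return int(weights.argmax()), weights
```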

Chunked Retrieval:

SemShareKV builds a global LSH index over overlapping token window embeddings, supporting prompt-level similarity search by tallying per-token match frequencies and scoring via a softmax-weighted token proximity function:

$$\mathrm{Sim}(Q,P) = \frac{1}{|Q|}\sum_{q\in Q}\exp\left(-\alpha \|x_q-x_{\mathrm{match}(q)}\|^2\right)$$

This approach scales to massive prompt libraries via sublinear indexing and sharding (Zhao et al., 29 Sep 2025).
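The scoring function itself is straightforward to sketch in NumPy; alpha and the pre-matched embedding arrays are assumptions about the interface:

```python
import numpy as np

def prompt_similarity(query_embs, matched_embs, alpha=1.0):
    """Sim(Q, P) = (1/|Q|) * sum_q exp(-alpha * ||x_q - x_match(q)||^2).

    query_embs[i] is a query-token embedding and matched_embs[i] its
    LSH-matched counterpart in the cached prompt P.
    """
    sq_dists = np.sum((query_embs - matched_embs) ** 2, axis=-1)
    return float(np.mean(np.exp(-alpha * sq_dists)))
```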

Creative/Evolutionary Search:

SCAPE iterates over a population of attribute-vectored prompts guided by human selectors, GPT-4-driven mutation/crossover, and explicit memory (a history of taboo and encouraged features). Mutation probability per attribute is $P_{\mathrm{mutate}}(a)=0.5$ if unrated, and crossover selection is weighted according to user ratings (Lim et al., 31 Jan 2024).
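A loose sketch of one generation step; how user ratings attach to individual attributes, and the sample_value helper standing in for GPT-4-driven rewriting, are assumptions:

```python
import random

ATTRIBUTES = ["style", "site", "color", "lighting", "shape", "material"]

def mutate(genes, rated_attributes, sample_value):
    """Re-sample each unrated attribute with probability P_mutate = 0.5."""
    child = dict(genes)
    for a in ATTRIBUTES:
        if a not in rated_attributes and random.random() < 0.5:
            child[a] = sample_value(a)  # stochastic attribute re-sampling
    return child

def crossover(parents, ratings):
    """Draw each attribute from a parent chosen proportionally to its rating."""
    return {a: random.choices(parents, weights=ratings)[0][a] for a in ATTRIBUTES}
```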

6. Limitations, Transferability, and Future Directions

Limitations of SPS frameworks center on transferability and generalization.

Transferability is empirical: evil twin prompts transfer between LLMs and across model sizes, though forward compatibility is not guaranteed (Melamed et al., 2023). SPS is extensible to multi-hop and multi-modal settings, with numerous directions for enhancement, such as learned phrase mining, deeper visual-language fusion layers, and hybrid soft/hard prompt search (Li et al., 28 Oct 2025, Zhao et al., 29 Sep 2025).

7. Applications and Practical Guidelines

Applications of SPS span prompt optimization and transfer, KVCache reuse for efficient LLM inference, retrieval-augmented captioning, dialog generation, and creative/evolutionary design exploration.

In practice, a salient trend is the growing emphasis on principled, semantically aware, and resource-efficient SPS, underpinned by empirical evidence of improved model performance, generation diversity, and search efficiency across diverse domains.
