
Descriptor-Based Prompting

Last updated: June 10, 2025

Prompting LLMs has emerged as a powerful paradigm for adapting their capabilities to diverse downstream tasks without extensive fine-tuning. A key aspect of this approach is the use of "descriptors" within the prompt: explicit pieces of information that guide the model's behavior, task execution, or output format. Descriptor-based prompting leverages structured inputs, specific tokens, or learned representations that function as cues to steer the pretrained model, often proving more efficient and flexible than full model fine-tuning, especially in few-shot or resource-constrained scenarios. This article synthesizes recent research exploring various facets of descriptor-based prompting, from automatic selection and dynamic generation to composition, persistence, and user interaction.

Significance and Background

Traditional methods for adapting large pretrained models to new tasks often rely on fine-tuning the entire model or task-specific layers, which can be computationally expensive, require large amounts of labeled data, and necessitate storing a separate model instance for each task (Zhang et al., 2022; Swamy et al., 2023; Liu et al., 2023). Prompt-based learning offers an alternative by formulating downstream tasks to resemble the model's pretraining objective (e.g., fill-in-the-blank) and conditioning the model with task-specific text. However, finding effective prompts requires significant human effort and iteration (Strobelt et al., 2022), as small changes in wording can lead to substantial performance differences (Desmond et al., 13 Mar 2024).

Descriptor-based prompting addresses these challenges by explicitly structuring or augmenting the prompt with information that describes the task, desired output, or relevant context. This approach can simplify the prompt engineering process (Strobelt et al., 2022), improve control over the model's output (Liu et al., 2023), and enable more efficient adaptation across tasks and modalities (Zhang et al., 2022; Liu et al., 2023; Chang et al., 23 Aug 2024). The core idea is that specifying what is needed using descriptive elements within the prompt allows the pretrained model to leverage its vast pretraining knowledge more effectively for the target task (Wang et al., 2022; Sisson, 2022; Zhang et al., 2022).

Foundational Concepts

At its core, descriptor-based prompting involves providing the LLM with explicit information about the desired behavior or context. This information acts as a "descriptor" that helps the model interpret the input and generate an appropriate output. Descriptors can take various forms:

  • Label Words or Phrases: In classification tasks, mapping class labels to specific words or short phrases is a fundamental form of descriptor. For instance, in sentiment analysis, the label "positive" might be mapped to words such as "great" (Wang et al., 2022). This approach extends to domain-specific vocabularies where the meaning of terms diverges from common usage, such as odor descriptors like "leather" or "fruity" (Sisson, 2022).
  • Structured Prompt Components: Prompts can be constructed from distinct, labeled sections or components, each serving a specific discourse role or conveying a particular type of information. PromptPrism identifies semantic components such as "Instruction" (Task, Guidelines, Role Assumption, Chain-of-Thought), "Contextual/Reference Info" (Few-shot Examples, Knowledge Base), "Output Constraints" (Label Space, Word Limits, Format, Style/Tone), and "Tools" (Jeoung et al., 19 May 2025). Analysis of enterprise prompt engineering practices also reveals users iterating on components such as context, instructions, persona, output length, output format, and labels (Desmond et al., 13 Mar 2024).
  • Learned Continuous Prompts: Instead of natural language text, descriptors can be represented as trainable vectors ("soft prompts") in the model's embedding space (Zhang et al., 2022; Yang et al., 2023; Pilault et al., 2023). These vectors are learned to steer the model's internal representations in a task-specific manner (Pilault et al., 2023); a minimal sketch appears after this list. In vision-language models, for example, prompt prototypes in embedding space can function as descriptors for image clusters, allowing similar images to use similar prompt prototypes (Zhang et al., 2022). In speech processing, soft prompts are learned to adapt speech LLMs for tasks like classification or generation (Chang et al., 23 Aug 2024).
  • Control Codes or Attributes: For tasks requiring specific outputs based on instance-level properties, the prompt can be conditioned on explicit attribute codes or descriptive text associated with each input instance (Liu et al., 2023; Chen et al., 2023). In dialogue systems, this could be a dialogue act label, a persona description, or the current dialog state (Liu et al., 2023; Swamy et al., 2023). A prompt encoder maps these attributes into continuous prompt vectors (Liu et al., 2023).
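To make the "learned continuous prompts" idea concrete, the sketch below prepends a small block of trainable soft-prompt vectors to a frozen transformer. It is a minimal illustration, not the implementation from any of the cited papers: the wrapper class, prompt length, and initialization scale are assumptions, and it presumes a Hugging Face-style model that accepts `inputs_embeds`.

```python
import torch
import torch.nn as nn


class SoftPromptWrapper(nn.Module):
    """Prepends trainable prompt vectors ("soft prompts") to the input
    embeddings of a frozen pretrained model, so the prompt acts as a
    learned descriptor that steers the model without updating its weights."""

    def __init__(self, base_model, num_prompt_tokens: int = 20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():  # keep the backbone frozen
            p.requires_grad = False
        hidden_dim = base_model.get_input_embeddings().embedding_dim
        # The only trainable parameters: one vector per virtual prompt token.
        self.prompt_embeds = nn.Parameter(
            torch.randn(num_prompt_tokens, hidden_dim) * 0.02
        )

    def forward(self, input_ids, attention_mask):
        token_embeds = self.base_model.get_input_embeddings()(input_ids)
        batch_size = input_ids.size(0)
        prompt = self.prompt_embeds.unsqueeze(0).expand(batch_size, -1, -1)
        inputs_embeds = torch.cat([prompt, token_embeds], dim=1)
        # Extend the attention mask so the model attends to the prompt vectors.
        prompt_mask = torch.ones(
            batch_size, self.prompt_embeds.size(0),
            dtype=attention_mask.dtype, device=attention_mask.device,
        )
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.base_model(inputs_embeds=inputs_embeds,
                               attention_mask=attention_mask)
```

Because only `prompt_embeds` receives gradients, each task needs only a small set of stored vectors rather than a separate copy of the model, which is the efficiency argument made above.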

The underlying mechanism often relies on the pretrained model's ability to condition its output probability distribution on the provided input, including the descriptor information (Strobelt et al., 2022; Yang et al., 2023). For classification, this might involve summing the probabilities of tokens associated with each class descriptor (Wang et al., 2022). Mathematically, the probability of a class $y$ given an input $x$ and a set of descriptor tokens $\mathcal{S}(y)$ might be calculated as

$$p(y \mid x) = \sum_{v \in \mathcal{S}(y)} p(\texttt{[MASK]} = v \mid x'),$$

where $x'$ is the prompted input produced by a prompt template $\mathcal{T}(x)$ containing a [MASK] token (Wang et al., 2022). For prototype-based prompting, the probability is a weighted sum over prompt prototypes based on image similarity (Zhang et al., 2022):

$$\mathrm{Prob}(x, c) = \sum_{k=1}^{K} \mathrm{sim}(x, \mathcal{P}_k) \cdot \mathrm{Prob}_{\mathcal{T}_k}(x, c),$$

where $\mathcal{P}_k$ are image prototypes and $\mathcal{T}_k$ are prompt prototypes (Zhang et al., 2022).
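The first formulation can be illustrated with a short verbalizer sketch over a masked language model. The prompt template, the class-to-word mapping, and the choice of bert-base-uncased are illustrative assumptions rather than the setup of any cited paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hypothetical verbalizer: each class label y maps to descriptor words S(y).
VERBALIZER = {
    "positive": ["great", "good", "wonderful"],
    "negative": ["terrible", "bad", "awful"],
}

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()


def classify(text: str) -> dict:
    # Prompt template T(x): wrap the input around a [MASK] slot.
    prompt = f"{text} Overall, it was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Distribution over the vocabulary at the [MASK] position.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    probs = logits[0, mask_pos].softmax(dim=-1).squeeze(0)
    # p(y | x) = sum over descriptor tokens v in S(y) of p([MASK] = v | x').
    scores = {}
    for label, words in VERBALIZER.items():
        token_ids = tokenizer.convert_tokens_to_ids(words)
        scores[label] = probs[token_ids].sum().item()
    return scores


print(classify("The movie kept me on the edge of my seat."))
```

The class scores here are unnormalized sums of mask-token probabilities; in practice they would typically be renormalized over the label set before reporting a prediction.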

Key Methodologies and Developments

Research in descriptor-based prompting has explored various strategies for generating, using, and managing descriptors:

Current Applications and State of the Art

Descriptor-based prompting has been successfully applied across a range of tasks and modalities:

Practical Considerations

Implementing descriptor-based prompting involves several practical considerations:

Emerging Trends and Future Directions

The field of descriptor-based prompting is actively evolving, with several promising directions highlighted in the research:

Descriptor-based prompting represents a significant evolution in how we interact with and adapt large pretrained models. By explicitly incorporating descriptive information into the prompting process, researchers are developing methods that are more controllable, efficient, and effective across a growing range of applications.