
DynaMaR: Dynamic Prompt with Mask Token Representation (2206.02982v1)

Published 7 Jun 2022 in cs.CL and cs.LG

Abstract: Recent research has shown that LLMs pretrained using unsupervised approaches can achieve significant performance improvements on many downstream tasks. Typically, when adapting these LLMs to a downstream task such as classification or regression, we employ a fine-tuning paradigm in which the sentence representation from the LLM is fed to a task-specific head and the model is fine-tuned end-to-end. However, with the emergence of models like GPT-3, prompt-based fine-tuning has proven to be a successful approach for few-shot tasks. Inspired by this work, we study discrete prompt technologies in practice. Two issues arise with the standard prompt approach. First, the model can overfit to the prompt template. Second, it requires manual effort to formulate the downstream task as a language modeling problem. In this paper, we propose an improvement to prompt-based fine-tuning that addresses these two issues. We refer to our approach as DynaMaR -- Dynamic Prompt with Mask Token Representation. Results show that DynaMaR achieves an average improvement of 10% in few-shot settings and of 3.7% in data-rich settings over the standard fine-tuning approach on four e-commerce applications.
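To make the prompt-based paradigm the abstract contrasts with standard fine-tuning more concrete, here is a minimal sketch of discrete-prompt classification with a masked LM. It assumes a HuggingFace checkpoint ("bert-base-uncased"), a hypothetical sentiment task, and an illustrative verbalizer; it is not the authors' DynaMaR implementation.

```python
# Illustrative sketch: prompt-based classification with a masked LM.
# The task is cast as a language modeling problem: the label is read off
# the [MASK] position via a verbalizer, rather than from a task-specific head.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def classify(sentence: str) -> str:
    # Wrap the input in a fixed discrete prompt template (the kind of
    # hand-crafted template the paper argues the model can overfit to).
    prompt = f"{sentence} Overall, it was {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the [MASK] position in the input sequence.
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
    # Hypothetical verbalizer: map each label to a vocabulary word and
    # compare the logits of those words at the mask position.
    verbalizer = {"positive": "great", "negative": "terrible"}
    scores = {
        label: logits[0, mask_pos, tokenizer.convert_tokens_to_ids(word)].item()
        for label, word in verbalizer.items()
    }
    return max(scores, key=scores.get)

print(classify("The battery lasts two full days."))
```

In few-shot use, the same model would then be fine-tuned on a handful of labeled prompts; DynaMaR's contribution, per the abstract, is to reduce dependence on any single hand-written template and on this manual task-to-LM reformulation.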

Authors (7)
  1. Xiaodi Sun (3 papers)
  2. Sunny Rajagopalan (1 paper)
  3. Priyanka Nigam (8 papers)
  4. Weiyi Lu (5 papers)
  5. Yi Xu (302 papers)
  6. Belinda Zeng (16 papers)
  7. Trishul Chilimbi (22 papers)