The paper explores the concept of Leap-of-Thought (LoT) capabilities in LLMs, particularly as it applies to the generation of creative humor. Leap-of-Thought is defined as a non-sequential, creative paradigm that involves making strong associations and knowledge leaps. This differs from the better-known Chain-of-Thought (CoT) approach, which guides LLMs to reason step by step and elicits their logical reasoning abilities.
CoT has proven effective for logical reasoning tasks, where each thought builds on the previous one in a sequential process. However, this sequential structure can limit solutions in creative problem-solving scenarios that call for non-sequential thinking or leaps in thought, such as offering creative humor in response to a prompt. To address this gap, the authors introduce the Creative Leap-of-Thought (CLoT) paradigm to enhance LLMs’ LoT abilities.
To investigate LoT in LLMs, the authors crafted the multimodal and multilingual Oogiri-GO dataset, containing over 130,000 samples from the Oogiri game. The game requires participants to respond humorously and unexpectedly to images, text, or both, making it well suited to the study of LoT. The researchers observed that existing LLMs have insufficient LoT ability for creative humor generation; accordingly, they introduced the CLoT paradigm, which has two main stages.
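For concreteness, a sample from such a dataset could be represented as below. This is a minimal sketch: the `OogiriSample` type and its field names are illustrative assumptions, not the dataset's published schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OogiriSample:
    """Illustrative record layout for one Oogiri-GO entry.
    Field names are assumptions made for this sketch."""
    image_path: Optional[str]   # image prompt, if any (image-to-text or image+text tasks)
    text_prompt: Optional[str]  # text prompt, if any (text-to-text or image+text tasks)
    response: str               # the human-written humorous answer
    language: str               # the dataset is multilingual, e.g. "en"
    rating: Optional[int]       # human preference signal, when available
```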
In the first stage, associable instruction tuning formulates the Oogiri-GO dataset into instruction-tuning data that trains pretrained LLMs for LoT humor generation and discrimination. This stage uses instruction templates that offer clues and encourage uninhibited exploration to foster creative thinking. The second stage, exploratory self-refinement, has the LLM produce more creative LoT data by exploring parallels between seemingly unrelated concepts and then refine itself on the resulting high-quality samples.
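The following Python sketch illustrates how the two stages could fit together, assuming a generic `llm` callable and a `score_humor` quality filter; the templates, helper names, and the filtering threshold are all assumptions made for illustration, not the authors' implementation.

```python
import random

# Hypothetical instruction templates for the two skills trained in stage 1.
GEN_TEMPLATE = "You are playing Oogiri. Give an unexpected, humorous response to: {prompt}"
DISC_TEMPLATE = "Is the following response to '{prompt}' humorous? Answer yes or no: {response}"

def build_instruction_data(oogiri_go):
    """Stage 1 (associable instruction tuning): turn raw Oogiri-GO entries
    into instruction-tuning pairs covering both generation and discrimination."""
    data = []
    for entry in oogiri_go:  # each entry: {"prompt": ..., "response": ...}
        data.append({"instruction": GEN_TEMPLATE.format(prompt=entry["prompt"]),
                     "output": entry["response"]})
        data.append({"instruction": DISC_TEMPLATE.format(**entry),
                     "output": "yes"})
    return data

def exploratory_self_refinement(llm, prompts, nouns, score_humor, threshold=0.8):
    """Stage 2 (exploratory self-refinement): condition generation on randomly
    drawn, weakly related nouns to force remote associations, keep only
    high-quality outputs, and return them as new fine-tuning data."""
    refined = []
    for prompt in prompts:
        hint = random.choice(nouns)  # seemingly unrelated condition word
        response = llm(GEN_TEMPLATE.format(prompt=f"{prompt} (relate it to '{hint}')"))
        if score_humor(prompt, response) >= threshold:  # self-filtering step
            refined.append({"instruction": GEN_TEMPLATE.format(prompt=prompt),
                            "output": response})
    return refined
```

The key design point in stage 2 is the random condition word: forcing the model to connect the prompt to a weakly associated noun is what drives the "leap" beyond purely sequential reasoning.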
The paper shows that CLoT-integrated LLMs outperform both vanilla and CoT-integrated LLMs on multiple-choice and ranking questions within the Oogiri game. Additionally, CLoT boosts creative abilities on tasks such as the Cloud Guessing Game and the Divergent Association Task, demonstrating its broader applicability.
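A multiple-choice evaluation of this kind could be harnessed as follows; `choose_option` and the prompt wording are hypothetical, assuming the same generic `llm` callable as above.

```python
def choose_option(llm, prompt, options):
    """Ask the model to pick the genuinely humorous response among
    candidates; accuracy over many items gives the benchmark score."""
    letters = "ABCDE"
    listing = "\n".join(f"{letters[i]}. {opt}" for i, opt in enumerate(options))
    answer = llm(
        f"In the Oogiri game, which response to '{prompt}' is the funniest?\n"
        f"{listing}\nAnswer with a single letter."
    )
    return answer.strip()[:1].upper()
```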
The paper argues that employing user review data, such as human rankings, could further enhance CLoT through reinforcement learning methods. Future work should also explore ways to reduce the amount of LLM training required and to preserve existing knowledge during instruction tuning. In conclusion, the proposed CLoT represents a significant step toward enabling LLMs to engage in creative and innovative applications across various domains.