
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication (1911.04192v2)

Published 11 Nov 2019 in cs.CL, cs.AI, and cs.CV

Abstract: Visual storytelling aims to automatically generate a narrative paragraph from a sequence of images. Existing approaches construct a text description for each image independently and roughly concatenate the descriptions into a story, which leads to semantically incoherent content. In this paper, we propose a new approach to visual storytelling that introduces a topic description task to detect the global semantic context of an image stream. A story is then constructed under the guidance of the topic description. To combine the two generation tasks, we propose a multi-agent communication framework that regards the topic description generator and the story generator as two agents and learns them simultaneously via an iterative updating mechanism. We validate our approach on the VIST dataset, where quantitative results, ablations, and human evaluation show that our method generates higher-quality stories than state-of-the-art methods.

Authors (8)
  1. Ruize Wang (11 papers)
  2. Zhongyu Wei (98 papers)
  3. Ying Cheng (17 papers)
  4. Piji Li (75 papers)
  5. Haijun Shan (8 papers)
  6. Ji Zhang (176 papers)
  7. Qi Zhang (785 papers)
  8. Xuanjing Huang (287 papers)
Citations (13)