Long and Diverse Text Generation with Planning-based Hierarchical Variational Model (1908.06605v2)

Published 19 Aug 2019 in cs.CL and cs.LG

Abstract: Existing neural methods for data-to-text generation are still struggling to produce long and diverse texts: they are insufficient to model input data dynamically during generation, to capture inter-sentence coherence, or to generate diversified expressions. To address these issues, we propose a Planning-based Hierarchical Variational Model (PHVM). Our model first plans a sequence of groups (each group is a subset of input items to be covered by a sentence) and then realizes each sentence conditioned on the planning result and the previously generated context, thereby decomposing long text generation into dependent sentence generation sub-tasks. To capture expression diversity, we devise a hierarchical latent structure where a global planning latent variable models the diversity of reasonable planning and a sequence of local latent variables controls sentence realization. Experiments show that our model outperforms state-of-the-art baselines in long and diverse text generation.

Planning-based Hierarchical Variational Model for Long and Diverse Text Generation

The paper introduces a novel approach for data-to-text generation focused on improving the generation of long and diverse texts. The primary contribution is the Planning-based Hierarchical Variational Model (PHVM), which is designed to overcome the limitations of existing neural models in generating text that is both long and expressively diverse.

Model Architecture and Innovations

PHVM integrates a high-level planning mechanism with hierarchical latent variables to decompose the task of generating long texts into manageable sub-tasks. This approach is motivated by the observation that human writers tend to organize content before exploring specific text realization. The model operates in two stages: planning and generation.

In the planning stage, the model uses a global latent variable, $z^p$, to capture the diversity of plausible text plans. This variable guides the planning decoder to segment the input data into a sequence of groups, where each group comprises the input items to be covered by a single sentence. This group-based planning decomposes long text generation into coherent sentence-level sub-tasks.
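
Below is a minimal PyTorch-style sketch of what such a planning stage could look like. The module structure, dimensions, item-scoring rule, and stop criterion are illustrative assumptions for exposition, not the paper's released implementation.

```python
import torch
import torch.nn as nn

class PlanDecoder(nn.Module):
    """Sketch: sample a global plan latent z^p, then decode a sequence of
    groups, each group being a subset of the encoded input items."""
    def __init__(self, item_dim=64, latent_dim=32, hidden_dim=128):
        super().__init__()
        self.prior = nn.Linear(item_dim, 2 * latent_dim)      # mean / log-variance of p(z^p | x)
        self.rnn = nn.GRUCell(latent_dim + item_dim, hidden_dim)
        self.query = nn.Linear(hidden_dim, item_dim)           # scores input items for the current group
        self.stop = nn.Linear(hidden_dim, 1)                   # decides when planning ends

    def forward(self, item_embs, max_groups=8):
        # item_embs: (num_items, item_dim), one embedding per input attribute-value pair
        ctx = item_embs.mean(dim=0, keepdim=True)              # crude summary of the input record
        mu, logvar = self.prior(ctx).chunk(2, dim=-1)
        z_p = mu + torch.randn_like(mu) * (0.5 * logvar).exp() # reparameterised sample of z^p

        h = torch.zeros(1, self.rnn.hidden_size)
        prev_group = torch.zeros_like(ctx)
        plan = []
        for _ in range(max_groups):
            h = self.rnn(torch.cat([z_p, prev_group], dim=-1), h)
            scores = torch.sigmoid(item_embs @ self.query(h).squeeze(0))   # (num_items,)
            group = (scores > 0.5).nonzero(as_tuple=True)[0]               # items assigned to this sentence
            plan.append(group)
            if len(group):
                prev_group = item_embs[group].mean(dim=0, keepdim=True)
            if torch.sigmoid(self.stop(h)).item() > 0.5:                   # hypothetical stop rule
                break
        return z_p, plan
```

Sampling different values of $z^p$ for the same input record would yield different plans, which is the mechanism the paper relies on for planning diversity.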

Once planning is complete, the model proceeds with hierarchical generation at two levels: sentence-level control and word-level realization. Local latent variables, $z_t^s$, govern the realization of each sentence, and dependencies among these variables are explicitly modeled to enhance inter-sentence coherence. This hierarchical latent structure enables the model to capture variation at different levels of text construction.
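
A corresponding sketch of the sentence-level control loop is shown below, again with assumed module names and dimensions; in the full model the global latent $z^p$ would also condition each local latent, and a word-level decoder (omitted here) would generate the words of each sentence.

```python
import torch
import torch.nn as nn

class SentenceRealizer(nn.Module):
    """Sketch: a chain of local latents z_t^s, each conditioned on the previous
    latent and on a sentence-level RNN state, one latent per planned group."""
    def __init__(self, group_dim=64, latent_dim=32, hidden_dim=128):
        super().__init__()
        self.sent_rnn = nn.GRUCell(group_dim + latent_dim, hidden_dim)
        self.prior = nn.Linear(hidden_dim + latent_dim, 2 * latent_dim)  # p(z_t^s | z_{t-1}^s, context)

    def forward(self, group_embs):
        # group_embs: (num_groups, group_dim), one vector per planned group
        h = torch.zeros(1, self.sent_rnn.hidden_size)
        z_prev = torch.zeros(1, self.prior.out_features // 2)
        latents = []
        for g in group_embs:
            mu, logvar = self.prior(torch.cat([h, z_prev], dim=-1)).chunk(2, dim=-1)
            z_t = mu + torch.randn_like(mu) * (0.5 * logvar).exp()        # local latent for this sentence
            latents.append(z_t)
            # A word-level decoder would generate the sentence from [g, z_t, h] at this point.
            h = self.sent_rnn(torch.cat([g.unsqueeze(0), z_t], dim=-1), h)
            z_prev = z_t
        return latents
```

Chaining the local latents in this way is what lets each sentence's realization depend on how the previous sentences were realized, rather than only on the static plan.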

Experimental Evaluation

PHVM was evaluated on two datasets: a newly constructed advertising text dataset and a recipe generation dataset. Notably, the advertising dataset challenges models to generate diverse descriptions from structured data about clothing items. Results demonstrated PHVM's superiority over baseline models, including Checklist, CVAE, Pointer-S2S, and Link-S2S, particularly on metrics such as coverage, distinct-4, and repetition-4, which respectively measure how completely the input attributes are expressed, the diversity of the output, and its internal redundancy.
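
For concreteness, the snippet below computes distinct-4 and repetition-4 style metrics under common definitions (unique 4-gram ratio for distinct-4; fraction of outputs that internally repeat a 4-gram for repetition-4). The paper's exact formulations may differ in detail.

```python
from collections import Counter

def ngrams(tokens, n=4):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def distinct_n(texts, n=4):
    # Ratio of unique n-grams to total n-grams across all generated texts.
    all_ngrams = [g for t in texts for g in ngrams(t.split(), n)]
    return len(set(all_ngrams)) / max(len(all_ngrams), 1)

def repetition_n(texts, n=4):
    # Fraction of generated texts that repeat at least one n-gram internally.
    def has_repeat(t):
        return any(c > 1 for c in Counter(ngrams(t.split(), n)).values())
    return sum(has_repeat(t) for t in texts) / max(len(texts), 1)
```

Higher distinct-4 and lower repetition-4 correspond to the "more diverse, less redundant" behavior reported for PHVM.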

In the advertising text generation task, PHVM produced texts with higher coverage of input attributes and greater variety, as evidenced by significantly higher scores in the diversity metric distinct-4. Additionally, the model's capability to plan content effectively was showcased by its ability to generate coherent and non-redundant text sequences. In recipe generation, PHVM similarly outperformed baselines by ensuring a greater use of given ingredients and producing more distinct procedural descriptions.

Theoretical and Practical Implications

The introduction of PHVM marks a progression in data-to-text generation by systematically addressing key issues of inter-sentence coherence and expression diversity. Theoretically, this model demonstrates the efficacy of combining planning with hierarchical latent variables, an approach that may inspire further research into hierarchical methods for narrative tasks in AI such as story generation or summarization.

Practically, the enhanced ability to generate long and nuanced descriptions offers direct applications in automated content creation for digital platforms, potentially benefiting sectors like e-commerce and automated reporting. The capability to generate diverse text from the same input also suggests applications in creative writing and content personalization, adapting outputs to various audience segments without significant data modification.

Future Directions

The paper opens several avenues for future research. Extending this hierarchical planning approach to other domains such as machine translation or dialog systems presents a promising direction. Investigating the integration of more complex input structures and the potential for real-time content adaptation could further broaden PHVM’s applicability. Moreover, exploring the balance between planning diversity and target specificity may yield insights into optimizing generation models across different contexts.

Overall, the paper presents a sophisticated model that adeptly addresses present challenges in long text generation, setting a foundation for future advancements in AI-driven narrative generation.

Authors (5)
  1. Zhihong Shao
  2. Minlie Huang
  3. Jiangtao Wen
  4. Wenfei Xu
  5. Xiaoyan Zhu
Citations (104)