Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation (1904.03396v2)

Published 6 Apr 2019 in cs.CL and cs.AI

Abstract: Data-to-text generation can be conceptually divided into two parts: ordering and structuring the information (planning), and generating fluent language describing the information (realization). Modern neural generation systems conflate these two steps into a single end-to-end differentiable system. We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization. For training a plan-to-text generator, we present a method for matching reference texts to their corresponding text plans. For inference time, we describe a method for selecting high-quality text plans for new inputs. We implement and evaluate our approach on the WebNLG benchmark. Our results demonstrate that decoupling text planning from neural realization indeed improves the system's reliability and adequacy while maintaining fluent output. We observe improvements both in BLEU scores and in manual evaluations. Another benefit of our approach is the ability to output diverse realizations of the same input, paving the way to explicit control over the generated text structure.

Authors (3)
  1. Amit Moryossef (25 papers)
  2. Yoav Goldberg (142 papers)
  3. Ido Dagan (72 papers)
Citations (179)

Summary

Separating Planning from Realization in Neural Data-to-Text Generation

The paper "Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation" explores the concept of dividing the process of converting structured data into written text into two distinct stages: planning and realization. The authors propose a generative model that separates text planning—a symbolic, rule-based process—from realization, which uses modern neural networks to craft fluent text outputs. This division allows the neural component to focus solely on generating fluent language, alleviating its need to control high-level text structuring decisions.

Conceptual Framework

The methodology centers on transforming sets of RDF triples into coherent text, as exemplified by the WebNLG corpus. Each triple relates a subject to an object through a predicate, for instance linking a person to their birthplace or to their employer, and the triples must be ordered and grouped before a final text is synthesized. The text-planning stage performs this organization, producing structures that are later converted into sentences; an illustrative example follows. The authors argue that this approach is more faithful to the input data and supports the production of diverse text outputs without compromising fluency.
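
To make the representation concrete, here is an illustrative WebNLG-style input and one possible text plan for it. The entities are invented, and the plan encoding (a list of sentences, each an ordered list of triples) is our own simplification of the paper's plan structure.

```python
# An illustrative WebNLG-style triple set (entities invented for the example).
triples = [
    ("John_Doe", "birthPlace", "London"),
    ("John_Doe", "employer", "Acme_Corp"),
    ("Acme_Corp", "headquarters", "Paris"),
]

# One candidate text plan: two sentences. A plan fixes the order of the
# triples and their grouping into sentences; the paper's plans can also
# express a triple in reversed direction.
plan = [
    # Sentence 1: where John Doe was born.
    [("John_Doe", "birthPlace", "London")],
    # Sentence 2: his employer, chained to a fact about that employer.
    [("John_Doe", "employer", "Acme_Corp"),
     ("Acme_Corp", "headquarters", "Paris")],
]

# A realizer might verbalize this plan as:
# "John Doe was born in London. He works for Acme Corp,
#  which is headquartered in Paris."
```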

Key Findings

The authors report improvements in both automated metrics, such as BLEU, and manual evaluations when compared to existing neural generation systems. By decoupling planning from realization, their system demonstrates improved reliability and adequacy while remaining as fluent as end-to-end neural systems. It can also produce diverse outputs by realizing different plans for the same input (sketched below), a capability that conflated, single-process models do not expose.
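
A hedged sketch of how that diversity could be exposed: enumerate alternative plans for an input and realize any of them, or pick one with a scoring function. The plan space here is deliberately simplified to orderings with one triple per sentence, and `score_plan` is a hypothetical plan-selection heuristic, not the paper's actual selection model.

```python
from itertools import permutations
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]
Plan = List[List[Triple]]

def candidate_plans(triples: List[Triple]) -> Iterable[Plan]:
    """Enumerate simple plans: every ordering, one triple per sentence.
    (The paper's plan space also varies sentence grouping and direction.)"""
    for order in permutations(triples):
        yield [[t] for t in order]

def select_plan(triples: List[Triple],
                score_plan: Callable[[Plan], float]) -> Plan:
    """Pick the highest-scoring candidate; realizing other candidates
    instead yields differently structured texts for the same input."""
    return max(candidate_plans(triples), key=score_plan)
```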

Implications and Future Directions

Practical Implications: The results suggest that data-to-text systems can benefit significantly from a structured planning stage before neural realization. This is especially relevant in settings where the generated text must accurately capture every detail of the input data without sacrificing fluent expression.

Theoretical Implications: Separating planning from realization challenges the assumption that a single neural network should handle both aspects. It opens avenues for pairing neural models with complementary symbolic components for high-level structuring, potentially leading to more interpretable generation pipelines.

Speculation on AI Developments: Future work might apply such architectures more broadly, for instance to report generation, summarization of complex datasets, and real-time response to dynamic inputs. Explicit control over the structure of generated text could also enable customizable human-machine interaction in journalism, data analysis, and automated content creation.

By removing the requirement that a single neural network handle both planning and realization, the paper presents a compelling case for rethinking standard practice in data-driven text generation. Further research might adapt the approach to other domains, probing what neural models can achieve when complemented by robust symbolic structuring methods.