
SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts (2108.07535v2)

Published 17 Aug 2021 in cs.CL

Abstract: Many generation tasks follow a one-to-many mapping relationship: each input can be associated with multiple outputs. Existing methods such as the Conditional Variational AutoEncoder (CVAE) employ a latent variable to model this one-to-many relationship. However, this high-dimensional, dense latent variable lacks explainability and usually leads to poor, uncontrollable generations. In this paper, we introduce the linguistic concept of pattern to decompose the one-to-many mapping into multiple one-to-one mappings, and we propose a model named Sparse Pattern Mixture of Experts (SPMoE). Each one-to-one mapping is associated with a conditional generation pattern and is modeled by an expert in SPMoE. To ensure that each language pattern is handled exclusively by one expert model, for better explainability and diversity, a sparse mechanism is employed to coordinate all the expert models in SPMoE. We assess the performance of SPMoE on the paraphrase generation task, and the experimental results show that SPMoE achieves a good balance among quality, pattern-level diversity, and corpus-level diversity.
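The abstract's core mechanism is sparse (hard) routing: a gating function scores all experts and sends each input to exactly one of them, so that each expert specializes in one generation pattern. The following is a minimal toy sketch of top-1 sparse gating; the experts, gating weights, and dimensions here are illustrative placeholders, not the paper's actual architecture.

```python
# Toy sketch of top-1 sparse expert routing (illustrative; not the paper's code).
NUM_EXPERTS = 3

# Each expert is a toy function standing in for a full generator network,
# one per hypothetical generation "pattern".
experts = [
    lambda x: [v + 1.0 for v in x],   # expert 0: "pattern A"
    lambda x: [v * 2.0 for v in x],   # expert 1: "pattern B"
    lambda x: [-v for v in x],        # expert 2: "pattern C"
]

# Toy gating weights: one row per expert; the gate score is a dot product
# between the input and that expert's weight row.
gate_weights = [
    [0.9, -0.2],   # scores expert 0
    [-0.5, 1.0],   # scores expert 1
    [0.1, 0.1],    # scores expert 2
]

def sparse_moe(x):
    """Top-1 sparse gating: score every expert, then route the input to the
    single highest-scoring expert (hard routing, unlike a dense soft mixture
    that would blend all experts' outputs)."""
    scores = [sum(w * v for w, v in zip(ws, x)) for ws in gate_weights]
    k = max(range(NUM_EXPERTS), key=lambda i: scores[i])
    return k, experts[k](x)

k, y = sparse_moe([1.0, 0.0])
# scores = [0.9, -0.5, 0.1] -> expert 0 is selected, y = [2.0, 1.0]
```

The hard argmax is what makes the mixture sparse: only one expert's parameters produce the output, which is what lets each expert be read off as owning one pattern.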

Authors (7)
  1. Shaobo Cui (15 papers)
  2. Xintong Bao (3 papers)
  3. Xuming Lin (4 papers)
  4. Zhongzhou Zhao (16 papers)
  5. Ji Zhang (176 papers)
  6. Wei Zhou (311 papers)
  7. Haiqing Chen (29 papers)
