Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation (1804.08207v2)
Published 23 Apr 2018 in cs.CL
Abstract: We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. The collection results from recasting 13 existing datasets from 7 semantic phenomena into a common NLI structure, resulting in over half a million labeled context-hypothesis pairs in total. We refer to our collection as the DNC: Diverse Natural Language Inference Collection. The DNC is available online at https://www.decomp.net, and will grow over time as additional resources are recast and added from novel sources.
- Adam Poliak (17 papers)
- Aparajita Haldar (8 papers)
- Rachel Rudinger (46 papers)
- J. Edward Hu (5 papers)
- Ellie Pavlick (66 papers)
- Aaron Steven White (29 papers)
- Benjamin Van Durme (173 papers)