Improving Semantic Composition with Offset Inference (1704.06692v1)
Published 21 Apr 2017 in cs.CL
Abstract: Count-based distributional semantic models suffer from sparsity due to unobserved but plausible co-occurrences in any text collection. This problem is amplified for models such as Anchored Packed Trees (APTs), which take the grammatical type of a co-occurrence into account. We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition.
- Thomas Kober
- Julie Weeds
- Jeremy Reffin
- David Weir