2000 character limit reached
A Corpus-based Toy Model for DisCoCat (1605.04013v2)
Published 13 May 2016 in cs.CL, cs.LO, and math.CT
Abstract: The categorical compositional distributional (DisCoCat) model of meaning rigorously connects distributional semantics and pregroup grammars, and has found a variety of applications in computational linguistics. From a more abstract standpoint, the DisCoCat paradigm predicates the construction of a mapping from syntax to categorical semantics. In this work we present a concrete construction of one such mapping, from a toy model of syntax for corpora annotated with constituent structure trees, to categorical semantics taking place in a category of free R-semimodules over an involutive commutative semiring R.