Emergence of order in random languages (1902.07516v3)
Published 20 Feb 2019 in cond-mat.dis-nn, cs.CL, and cs.FL
Abstract: We consider languages generated by weighted context-free grammars. It is shown that the behaviour of large texts is controlled by saddle-point equations for an appropriate generating function. We then consider ensembles of grammars, in particular the Random Language Model of E. DeGiuli, Phys. Rev. Lett., 122, 128301, 2019. This model is solved in the replica-symmetric ansatz, which is valid in the high-temperature, disordered phase. It is shown that in the phase in which languages carry information, the replica symmetry must be broken.