Log-log Convexity of Type-Token Growth in Zipf's Systems (1412.4577v1)
Published 15 Dec 2014 in physics.soc-ph and physics.data-an
Abstract: It is traditionally assumed that Zipf's law implies the power-law growth of the number of different elements with the total number of elements in a system, the so-called Heaps' law. We show that a careful definition of Zipf's law leads to the violation of Heaps' law in random systems, and obtain alternative growth curves. These curves exhibit universal data collapses that depend only on the value of Zipf's exponent. We observe that real books behave very much in the same way as random systems, despite the presence of burstiness in word occurrence. We advance an explanation for this unexpected correspondence.
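
The type-token growth described in the abstract can be illustrated with a minimal random-system sketch. This is not the authors' procedure: it assumes a finite vocabulary with rank probabilities proportional to r^(-exponent), which may differ from the paper's careful definition of Zipf's law. The function name `type_token_curve` and all parameter values are hypothetical choices made for illustration.

```python
import random
from itertools import accumulate


def type_token_curve(num_tokens, vocab_size, exponent, seed=0):
    """Return a list whose entry i is the number of distinct types
    among the first i + 1 tokens of a randomly generated sequence.

    Assumption (not from the paper): tokens are drawn independently
    from a truncated Zipf-like distribution over `vocab_size` ranks.
    """
    rng = random.Random(seed)
    # Weight of rank r is r ** (-exponent); random.choices normalises them.
    weights = [r ** -exponent for r in range(1, vocab_size + 1)]
    cum_weights = list(accumulate(weights))
    tokens = rng.choices(range(vocab_size), cum_weights=cum_weights, k=num_tokens)

    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)          # a repeated token leaves the set unchanged
        curve.append(len(seen))  # distinct types seen so far
    return curve


# Illustrative parameters only; print the type count at a few text lengths.
curve = type_token_curve(num_tokens=100_000, vocab_size=50_000, exponent=1.5)
for n in (100, 1_000, 10_000, 100_000):
    print(n, curve[n - 1])
```

Plotting the curve on log-log axes for several exponents is one way to check by eye whether the growth follows a straight line, as Heaps' law would require, or bends.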