Guesswork, large deviations and Shannon entropy (1205.4135v2)
Abstract: How hard is it guess a password? Massey showed that that the Shannon entropy of the distribution from which the password is selected is a lower bound on the expected number of guesses, but one which is not tight in general. In a series of subsequent papers under ever less restrictive stochastic assumptions, an asymptotic relationship as password length grows between scaled moments of the guesswork and specific R\'{e}nyi entropy was identified. Here we show that, when appropriately scaled, as the password length grows the logarithm of the guesswork satisfies a Large Deviation Principle (LDP), providing direct estimates of the guesswork distribution when passwords are long. The rate function governing the LDP possess a specific, restrictive form that encapsulates underlying structure in the nature of guesswork. Returning to Massey's original observation, a corollary to the LDP shows that expectation of the logarithm of the guesswork is the specific Shannon entropy of the password selection process.