How random are a learner's mistakes? (0903.3667v5)
Abstract: Consider a binary sequence $X^{(n)}$ of random variables $X_{t}$, $t=1,2,\ldots,n$, generated, for instance, by a Markov source (teacher) of order $k^{*}$ (each state represented by $k^{*}$ bits). Assume that the probability of the event $X_{t}=1$ is constant and denote it by $\beta$. Consider a learner based on a parametric model, for instance a Markov model of order $k$, which trains on a sequence $x^{(m)}$ randomly drawn by the teacher. Test the learner's performance by giving it a sequence $x^{(n)}$ (generated by the teacher) and checking its prediction on every bit of $x^{(n)}$. An error occurs at time $t$ if the learner's prediction $Y_{t}$ differs from the true bit value $X_{t}$. Denote by $\xi^{(n)}$ the sequence of errors, where the error bit $\xi_{t}$ at time $t$ equals 1 or 0 according to whether an error occurs or not, respectively. Consider the subsequence $\xi^{(\nu)}$ of $\xi^{(n)}$ that corresponds to the errors made when predicting a 0, i.e., $\xi^{(\nu)}$ consists of the bits of $\xi^{(n)}$ only at times $t$ such that $Y_{t}=0$. In this paper we compute an estimate on the deviation of the frequency of 1s in $\xi^{(\nu)}$ from $\beta$. The result shows that the level of randomness of $\xi^{(\nu)}$ decreases as the complexity of the learner increases.
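To make the teacher-learner setup concrete, the following is a minimal simulation sketch, assuming a binary Markov teacher of order $k^{*}$ with randomly chosen conditional probabilities and a learner that estimates empirical conditional frequencies over $k$-bit contexts. All parameter values and function names are illustrative assumptions, not taken from the paper; the paper's result concerns an analytic estimate, while this only simulates the quantities it defines.

```python
# Simulation sketch (assumed setup): teacher of order k_star, learner of
# order k; compare the frequency of 1s in xi^(nu) with beta = P(X_t = 1).
import numpy as np

rng = np.random.default_rng(0)

def make_teacher(k_star):
    """Random conditional probabilities P(X_t = 1 | last k_star bits)."""
    return rng.uniform(0.1, 0.9, size=2 ** k_star)

def generate(teacher, k_star, length):
    """Draw a binary sequence from the order-k_star Markov source."""
    x = list(rng.integers(0, 2, size=k_star))      # arbitrary initial state
    for _ in range(length):
        state = int("".join(map(str, x[-k_star:])), 2)
        x.append(int(rng.random() < teacher[state]))
    return np.array(x[k_star:])

def train_learner(x, k):
    """Empirical frequency of a 1 following each k-bit context."""
    ones, total = np.zeros(2 ** k), np.zeros(2 ** k)
    for t in range(k, len(x)):
        state = int("".join(map(str, x[t - k:t])), 2)
        total[state] += 1
        ones[state] += x[t]
    return np.where(total > 0, ones / np.maximum(total, 1), 0.5)

def predict(learner, x, k):
    """Most-likely-bit predictions Y_t for each position t >= k."""
    y = [int(learner[int("".join(map(str, x[t - k:t])), 2)] >= 0.5)
         for t in range(k, len(x))]
    return np.array(y)

k_star, k, m, n = 3, 2, 50_000, 50_000   # illustrative sizes
teacher = make_teacher(k_star)
x_train = generate(teacher, k_star, m)
x_test = generate(teacher, k_star, n)

learner = train_learner(x_train, k)
y = predict(learner, x_test, k)
x = x_test[k:]                 # align true bits with predictions
xi = (y != x).astype(int)      # error sequence xi^(n)

beta_hat = x.mean()            # empirical P(X_t = 1)
xi_nu = xi[y == 0]             # errors made at times where Y_t = 0
print(f"beta ~ {beta_hat:.3f}, freq of 1s in xi^(nu) = {xi_nu.mean():.3f}")
```

Note that when $Y_{t}=0$ an error occurs exactly when $X_{t}=1$, so the frequency of 1s in $\xi^{(\nu)}$ estimates $P(X_{t}=1 \mid Y_{t}=0)$; its deviation from $\beta$ is the quantity the abstract describes, and rerunning with larger $k$ lets one observe the dependence on learner complexity.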