Traffic Minimizing Caching and Latent Variable Distributions of Order Statistics (1704.04146v1)
Abstract: Given a statistical model for the request frequencies and sizes of data objects in a caching system, we derive the probability density of the size of the file that accounts for the largest amount of data traffic. This is equivalent to finding the cache size required by a caching placement that maximizes the expected byte hit ratio for given file size and popularity distributions. The file that maximizes the expected byte hit ratio is the one for which the product of size and popularity is highest; it is therefore the file that incurs the greatest load on the network. We generalize this theoretical problem to cover factors and addends of arbitrary order statistics for given parent distributions. Further, we study the asymptotic behavior of these distributions. We give several factor and addend densities for widely used distributions, and verify our results by extensive computer simulations.
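
The quantity studied in the abstract can be illustrated with a small Monte Carlo sketch: draw sizes and popularities for a catalog of files, pick the file with the largest size-popularity product, and record its size. The parent distributions below (exponential sizes, uniform popularities), the catalog size, and the function names `size_of_heaviest_file` and `simulate` are assumptions chosen for illustration; they are not taken from the paper.

```python
import random


def size_of_heaviest_file(sizes, popularities):
    """Return the size of the file whose size * popularity product is largest."""
    best = max(range(len(sizes)), key=lambda i: sizes[i] * popularities[i])
    return sizes[best]


def simulate(n_files=50, n_trials=2000, seed=0):
    """Sample the size of the traffic-maximizing file over many random catalogs."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_trials):
        # Assumed parent distributions (illustrative only):
        # sizes ~ Exponential(1), popularities ~ Uniform(0, 1).
        sizes = [rng.expovariate(1.0) for _ in range(n_files)]
        popularities = [rng.random() for _ in range(n_files)]
        samples.append(size_of_heaviest_file(sizes, popularities))
    return samples


if __name__ == "__main__":
    samples = simulate()
    mean_size = sum(samples) / len(samples)
    print(f"mean size of the heaviest file: {mean_size:.3f}")
```

A histogram of the returned samples approximates the density of the size factor of the maximal product under these assumed parents; swapping in other size and popularity distributions only requires changing the two sampling lines.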