Nearest-Neighbours Estimators for Conditional Mutual Information (2403.00556v3)
Abstract: The conditional mutual information quantifies the conditional dependence of two random variables. It has numerous applications; it forms, for example, part of the definition of transfer entropy, a common measure of the causal relationship between time series. It does, however, require a lot of data to estimate accurately and suffers from the curse of dimensionality, limiting its application in machine learning and data science. The Kozachenko-Leonenko approach can address this problem: it makes it possible to define a nearest-neighbour estimator that depends only on the distances between data points and not on the dimension of the data. Furthermore, the bias of this estimator can be calculated analytically. Here the estimator is described and tested on simulated data.
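As a concrete illustration of the family of estimators the abstract describes, the following is a minimal Python sketch of the standard Frenzel-Pompe k-nearest-neighbour estimator of the conditional mutual information I(X;Y|Z), which is built from Kozachenko-Leonenko entropy terms and uses only max-norm distances between data points. It is not necessarily the exact estimator proposed in the paper; the function name cmi_knn, the choice k=4 and the simulated Gaussian data are all illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma


def cmi_knn(x, y, z, k=4):
    """k-nearest-neighbour estimate of I(X;Y|Z) in nats.

    x, y, z : float arrays of shape (n_samples, dim).
    Only max-norm distances between points enter the estimate,
    never the dimensions of the spaces themselves.
    """
    xyz = np.hstack((x, y, z))
    xz = np.hstack((x, z))
    yz = np.hstack((y, z))

    # Distance from each point to its k-th nearest neighbour in the
    # joint (x, y, z) space; index 0 of the query is the point itself.
    eps = cKDTree(xyz).query(xyz, k=k + 1, p=np.inf)[0][:, -1]

    def count(points):
        # Count neighbours strictly inside the radius eps in a marginal
        # space (the tiny shrinkage enforces the strict inequality);
        # subtract 1 to exclude the point itself.
        tree = cKDTree(points)
        return np.asarray(
            tree.query_ball_point(points, eps - 1e-12, p=np.inf, return_length=True)
        ) - 1

    n_xz, n_yz, n_z = count(xz), count(yz), count(z)

    # Frenzel-Pompe combination of Kozachenko-Leonenko entropy terms.
    return digamma(k) - np.mean(
        digamma(n_xz + 1) + digamma(n_yz + 1) - digamma(n_z + 1)
    )


# Sanity check on simulated data: X and Y depend on each other only
# through Z, so the estimated I(X;Y|Z) should be close to zero.
rng = np.random.default_rng(0)
z = rng.normal(size=(2000, 1))
x = z + 0.5 * rng.normal(size=(2000, 1))
y = z + 0.5 * rng.normal(size=(2000, 1))
print(cmi_knn(x, y, z, k=4))
```

In this example the printed value should be close to zero nats, while making y a function of x as well as z gives a clearly positive estimate. The Frenzel & Pompe and Kraskov et al. entries in the reference list below give the derivation of the digamma formula.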
- Thomas Schreiber. Measuring information transfer. Physical Review Letters, 85(2):461, 2000.
- Milan Paluš, Vladimír Komárek, Zbyněk Hrnčíř, and Katalin Štěrbová. Synchronization as adjustment of information rates: Detection from bivariate time series. Physical Review E, 63(4):046211, 2001.
- Lionel Barnett, Adam B Barrett, and Anil K Seth. Granger causality and transfer entropy are equivalent for Gaussian variables. Physical Review Letters, 103(23):238701, 2009.
- William McGill. Multivariate information transmission. Transactions of the IRE Professional Group on Information Theory, 4(4):93–111, 1954.
- Anthony J Bell. The co-information lattice. In Proceedings of the Fifth International Workshop on Independent Component Analysis and Blind Signal Separation (ICA 2003), 2003.
- A kernel-based calculation of information on a metric space. Entropy, 15(10):4540–4552, 2013.
- Conor Houghton. Calculating mutual information for spike trains and other data with distances but no coordinates. Royal Society Open Science, 2(5):140391, 2015.
- LF Kozachenko and Nikolai N Leonenko. Sample estimate of the entropy of a random vector. Problemy Peredachi Informatsii, 23(2):9–16, 1987.
- Alexander Kraskov, Harald Stögbauer, and Peter Grassberger. Estimating mutual information. Physical Review E, 69(6):066138, 2004.
- Stefan Frenzel and Bernd Pompe. Partial mutual information for coupling analysis of multivariate time series. Physical Review Letters, 99(20):204101, 2007.
- Joseph T Lizier. JIDT: An information-theoretic toolkit for studying the dynamics of complex systems. Frontiers in Robotics and AI, 1:11, 2014.
- Assessing coupling dynamics from an ensemble of time series. Entropy, 17(4):1958–1970, 2015.
- Conor Houghton. Calculating the mutual information between two spike trains. Neural Computation, 31(2):330–343, 2019.
- A note on the unbiased estimation of mutual information. arXiv:2105.08682, 2021.
- William H Press, Saul A Teukolsky, William T Vetterling, and Brian P Flannery. Numerical recipes 3rd edition: The art of scientific computing. Cambridge University Press, 2007.
- Milton Abramowitz and Irene A Stegun. Handbook of mathematical functions with formulas, graphs, and mathematical tables, volume 55. US Government Printing Office, 1968.
- Kateřina Hlaváčková-Schindler, Milan Paluš, Martin Vejmelka, and Joydeep Bhattacharya. Causality detection based on information-theoretic approaches in time series analysis. Physics Reports, 441(1):1–46, 2007.
- Raul Vicente, Michael Wibral, Michael Lindner, and Gordon Pipa. Transfer entropy—a model-free measure of effective connectivity for the neurosciences. Journal of Computational Neuroscience, 30(1):45–67, 2011.
- Matthäus Staniek and Klaus Lehnertz. Symbolic transfer entropy. Physical Review Letters, 100(15):158101, 2008.
- Terry Bossomaier, Lionel Barnett, Michael Harré, and Joseph T Lizier. An introduction to transfer entropy. Springer, 2016.
- Jonathan D Victor and Keith P Purpura. Metric-space analysis of spike trains: theory, algorithms and application. Network: Computation in Neural Systems, 8(2):127–164, 1997.
- MCW van Rossum. A novel spike distance. Neural Computation, 13(4):751–763, 2001.
- A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. Journal of Theoretical Biology, 241(2):252–261, 2006.
- Multivariate dependence and genetic networks inference. IET Systems Biology, 4(6):428–440, 2010.
- NbIT - a new information theory-based analysis of allosteric mechanisms reveals residues that underlie function in the leucine transporter LeuT. PLoS Computational Biology, 10(5):e1003603, 2014.