Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
162 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Square Hellinger Subadditivity for Bayesian Networks and its Applications to Identity Testing (1612.03164v1)

Published 9 Dec 2016 in cs.LG, cs.IT, math.IT, math.PR, math.ST, stat.ML, and stat.TH

Abstract: We show that the square Hellinger distance between two Bayesian networks on the same directed graph, $G$, is subadditive with respect to the neighborhoods of $G$. Namely, if $P$ and $Q$ are the probability distributions defined by two Bayesian networks on the same DAG, our inequality states that the square Hellinger distance, $H2(P,Q)$, between $P$ and $Q$ is upper bounded by the sum, $\sum_v H2(P_{{v} \cup \Pi_v}, Q_{{v} \cup \Pi_v})$, of the square Hellinger distances between the marginals of $P$ and $Q$ on every node $v$ and its parents $\Pi_v$ in the DAG. Importantly, our bound does not involve the conditionals but the marginals of $P$ and $Q$. We derive a similar inequality for more general Markov Random Fields. As an application of our inequality, we show that distinguishing whether two Bayesian networks $P$ and $Q$ on the same (but potentially unknown) DAG satisfy $P=Q$ vs $d_{\rm TV}(P,Q)>\epsilon$ can be performed from $\tilde{O}(|\Sigma|{3/4(d+1)} \cdot n/\epsilon2)$ samples, where $d$ is the maximum in-degree of the DAG and $\Sigma$ the domain of each variable of the Bayesian networks. If $P$ and $Q$ are defined on potentially different and potentially unknown trees, the sample complexity becomes $\tilde{O}(|\Sigma|{4.5} n/\epsilon2)$, whose dependence on $n, \epsilon$ is optimal up to logarithmic factors. Lastly, if $P$ and $Q$ are product distributions over ${0,1}n$ and $Q$ is known, the sample complexity becomes $O(\sqrt{n}/\epsilon2)$, which is optimal up to constant factors.

Citations (46)

Summary

We haven't generated a summary for this paper yet.