Pearson Distance is not a Distance

Published 15 Aug 2019 in stat.ME and stat.ML | (1908.06029v1)

Abstract: The Pearson distance between a pair of random variables $X,Y$ with correlation $\rho_{xy}$, namely $1-\rho_{xy}$, has gained widespread use, particularly for clustering, in areas such as gene expression analysis, brain imaging and cyber security. In all these applications it is implicitly assumed or required that the distance measures be metrics, thus satisfying the triangle inequality. We show, however, that Pearson distance is not a metric. We go on to show that this can be repaired by recalling the result (well known in other literature) that $\sqrt{1-\rho_{xy}}$ is a metric. We similarly show that a related measure of interest, $1-|\rho_{xy}|$, which is invariant to the sign of $\rho_{xy}$, is not a metric but that $\sqrt{1-\rho_{xy}^2}$ is. We also give generalizations of these results.
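
The claims in the abstract can be checked numerically on a single triple of variables. The sketch below is an illustration, not the paper's own construction: it assumes $X$ and $Z$ are uncorrelated and $Y = (X+Z)/\sqrt{2}$, which gives $\rho_{xy} = \rho_{yz} = 1/\sqrt{2}$ and $\rho_{xz} = 0$. The helper `triangle_holds` and the dictionary `candidates` are hypothetical names introduced only for this example.

```python
import math

# Assumed illustrative correlations (not taken from the paper):
# X and Z uncorrelated, Y = (X + Z) / sqrt(2), so rho_xy = rho_yz = 1/sqrt(2).
rho_xy = rho_yz = 1 / math.sqrt(2)
rho_xz = 0.0


def triangle_holds(dist):
    """Check d(x,z) <= d(x,y) + d(y,z) for one candidate distance function."""
    # Small tolerance guards against floating-point rounding.
    return dist(rho_xz) <= dist(rho_xy) + dist(rho_yz) + 1e-12


# The four candidate distances named in the abstract, as functions of rho.
candidates = {
    "1 - rho": lambda r: 1 - r,
    "1 - |rho|": lambda r: 1 - abs(r),
    "sqrt(1 - rho)": lambda r: math.sqrt(1 - r),
    "sqrt(1 - rho^2)": lambda r: math.sqrt(1 - r * r),
}

results = {name: triangle_holds(d) for name, d in candidates.items()}
for name, ok in results.items():
    print(f"{name:16s} triangle inequality holds for this triple: {ok}")
```

For this triple, $1-\rho$ gives $d(x,z) = 1$ against $d(x,y) + d(y,z) \approx 0.586$, so the triangle inequality fails, and likewise for $1-|\rho|$. The two square-root forms satisfy it here. A single triple can only disprove the metric property; that the square-root forms hold in general rests on the results cited in the abstract, not on this check.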
