Tree Oriented Data Analysis (1409.5501v2)
Abstract: Complex data objects arise in many areas of modern science, including evolutionary biology, neuroscience, dynamics of gene expression, and medical imaging. Object oriented data analysis (OODA) is the statistical analysis of data sets of complex objects. Data analysis of tree data objects is an exciting research area with interesting questions and challenging problems. This thesis focuses on tree oriented statistical methodologies and on algorithms for solving related mathematical optimization problems. This research is motivated by the goal of analyzing a data set of images of human brain arteries. The approach we take here is to use a novel representation of brain artery systems as points in phylogenetic treespace. The treespace property of unique global geodesics leads to a notion of geometric center called a Fréchet mean. For a sample of data points, the Fréchet function is the sum of squared distances from a point to the data points, and the Fréchet mean is the minimizer of the Fréchet function. In this thesis we use properties of the Fréchet function to develop an algorithmic system for computing Fréchet means. Properties of the Fréchet function are also used to show a sticky law of large numbers, which describes a surprising stability of the topological tree structure of sample Fréchet means at that of the population Fréchet mean. We also introduce non-parametric regression of brain artery tree structure as a response variable to age, based on weighted Fréchet means.
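In symbols, the two definitions in the abstract can be written as follows. This is a sketch in notation assumed here rather than taken from the thesis: $T$ denotes treespace, $d$ the geodesic distance on $T$, and $x_1, \dots, x_n \in T$ the sample.

\[
F(x) = \sum_{i=1}^{n} d(x, x_i)^2,
\qquad
\bar{x} = \operatorname*{arg\,min}_{x \in T} F(x).
\]

Under the same assumed notation, a weighted Fréchet mean would minimize $\sum_{i=1}^{n} w_i \, d(x, x_i)^2$ for nonnegative weights $w_i$. The abstract does not say how the weights are chosen for the regression on age, so their form is left unspecified here.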