Building a Balanced k-d Tree with MapReduce (1512.06389v7)
Abstract: The original description of the k-d tree recognized that rebalancing techniques, such as those used to build an AVL tree or a red-black tree, are not applicable to a k-d tree. Hence, to build a balanced k-d tree, it is necessary to obtain all of the data before building the tree and then to build the tree via recursive subdivision of the data. One algorithm for building a balanced k-d tree finds the median of the data for each recursive subdivision and builds the tree in O(n log n) time. A new algorithm presorts the data in each of the k dimensions before building the tree, then preserves the order of these k presorts during recursive subdivision, and builds the tree in O(kn log n) time. This new algorithm is amenable to execution via MapReduce and permits building and searching a k-d tree that is represented as a distributed graph.
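
The presort-then-partition idea in the abstract can be illustrated with a minimal single-process sketch in Python. It is not the paper's implementation and does not use MapReduce. The names `Node`, `super_key`, `build_kd_tree`, and `height` are hypothetical, and two details are assumptions made here: duplicate points are discarded, and ties in one coordinate are broken by comparing the remaining coordinates in cyclic order. The sketch favors clarity and is not tuned to meet the stated O(kn log n) bound.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class Node:
    """One k-d tree node holding a point and two subtrees."""

    def __init__(self, point: Point,
                 left: Optional["Node"] = None,
                 right: Optional["Node"] = None) -> None:
        self.point = point
        self.left = left
        self.right = right


def super_key(p: Point, axis: int) -> Point:
    # Assumption: break ties on `axis` by the remaining coordinates,
    # taken in cyclic order, so distinct points never compare equal.
    return p[axis:] + p[:axis]


def build_kd_tree(points: Sequence[Point]) -> Optional[Node]:
    pts = list(set(points))  # assumption: duplicates are dropped
    if not pts:
        return None
    k = len(pts[0])
    # Presort once per dimension, before any tree construction.
    presorted = [sorted(pts, key=lambda p, a=a: super_key(p, a))
                 for a in range(k)]
    return _build(presorted, 0)


def _build(lists: List[List[Point]], depth: int) -> Optional[Node]:
    n = len(lists[0])
    if n == 0:
        return None
    axis = depth % len(lists)
    median = lists[axis][n // 2]  # median is read directly, no search
    mkey = super_key(median, axis)
    lower: List[List[Point]] = []
    upper: List[List[Point]] = []
    # Split every presorted list around the median. Scanning each list
    # in order keeps both halves sorted, so no re-sorting is needed.
    for lst in lists:
        lower.append([p for p in lst if super_key(p, axis) < mkey])
        upper.append([p for p in lst if super_key(p, axis) > mkey])
    return Node(median,
                _build(lower, depth + 1),
                _build(upper, depth + 1))


def height(node: Optional[Node]) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))
```

For example, `build_kd_tree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])` returns a root whose subtrees differ in size by at most one point at every level, because each split takes the middle element of an already sorted list.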