TreeTracker Join: Simple, Optimal, Fast
Abstract: We present a novel linear-time acyclic join algorithm, TreeTracker Join (TTJ). The algorithm can be understood as the pipelined binary hash join with a simple twist: upon a hash lookup failure, TTJ resets execution to the binding of the tuple causing the failure, and removes the offending tuple from its relation. Compared to the best known linear-time acyclic join algorithm, Yannakakis's algorithm, TTJ shares the same asymptotic complexity while imposing lower overhead. Further, we prove that when query performance is measured by the number of hash probes, TTJ matches or outperforms binary hash join on the same plan. This property holds independently of the plan and independently of acyclicity. We extend our theoretical results to cyclic queries by introducing a new hypergraph decomposition method called tree convolution. Tree convolution iteratively identifies and contracts acyclic subgraphs of the query hypergraph. The method avoids redundant calculations associated with tree decomposition and may be of independent interest. Experiments on TPC-H, the Join Order Benchmark, and the Star Schema Benchmark show favorable results.
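The "simple twist" described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal Python illustration written under several assumptions: the function name `ttj`, the `(attrs, rows)` relation layout, and the `parent` list (the join-tree parent of each relation in the plan, supplied by the caller) are all hypothetical. It runs a left-deep pipelined hash join and, on a failed probe at relation `i`, jumps back to `parent[i]` and deletes the tuple that produced the failing binding. Deletion uses `list.remove` for brevity, so this sketch does not preserve the linear-time guarantee.

```python
from collections import defaultdict

def ttj(relations, parent):
    """Sketch of a TTJ-style join over a left-deep plan.

    relations: list of (attrs, rows); each row is a tuple aligned with attrs.
    parent:    parent[i] is the plan index of relation i's join-tree parent
               (assumed to be supplied by the caller; parent[0] is None).
    Returns (result_rows, number_of_hash_probes).
    """
    n = len(relations)
    root_attrs, root_rows = relations[0]
    root = [dict(zip(root_attrs, row)) for row in root_rows]

    # Build one hash table per non-root relation, keyed on the attributes
    # it shares with relations earlier in the plan.
    bound = set(root_attrs)
    keys, tables = [()], [None]
    for attrs, rows in relations[1:]:
        key = tuple(a for a in attrs if a in bound)
        table = defaultdict(list)
        for row in rows:
            rec = dict(zip(attrs, row))
            table[tuple(rec[a] for a in key)].append(rec)
        keys.append(key)
        tables.append(table)
        bound.update(attrs)

    results, probes = [], 0

    def join(binding, i):
        nonlocal probes
        if i == n:                      # every relation matched: emit a row
            results.append(binding)
            return None
        if i == 0:
            matches = root              # the root relation is scanned, not probed
        else:
            probes += 1
            matches = tables[i].get(tuple(binding[a] for a in keys[i]))
            if not matches:
                return parent[i]        # lookup failure: jump back to the parent
        for m in list(matches):         # iterate over a copy; matches may shrink
            signal = join({**binding, **m}, i + 1)
            if signal == i:             # this relation's tuple m caused the failure
                if i > 0:
                    matches.remove(m)   # remove the offending tuple
            elif signal is not None:
                return signal           # keep unwinding toward the target relation
        return None

    join({}, 0)
    return results, probes

# Chain query R(a,b) JOIN S(b,c) JOIN T(c,d); S's parent is R, T's parent is S.
R = (("a", "b"), [(1, 10), (2, 20), (3, 10)])
S = (("b", "c"), [(10, 100), (10, 101), (20, 200)])
T = (("c", "d"), [(100, "x")])
rows, probes = ttj([R, S, T], parent=[None, 0, 1])
```

In this example the S tuple `(10, 101)` has no partner in T, so it is deleted the first time it is reached; when the R tuple `(3, 10)` later probes S, the dangling tuple is no longer there and the corresponding probe into T is never issued. A plain pipelined hash join on the same plan would repeat that failing probe.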