Optimizing Index Deployment Order for Evolving OLAP (Extended Version) (1107.3606v3)
Abstract: Query workloads and database schemas in OLAP applications are becoming increasingly complex. Moreover, the queries and the schemas have to continually \textit{evolve} to address business requirements. During such repetitive transitions, the \textit{order} of index deployment has to be considered while designing the physical schemas such as indexes and MVs. An effective index deployment ordering can produce (1) a prompt query runtime improvement and (2) a reduced total deployment time. Both of these are essential qualities of design tools for quickly evolving databases, but optimizing the problem is challenging because of complex index interactions and a factorial number of possible solutions. We formulate the problem in a mathematical model and study several techniques for solving the index ordering problem. We demonstrate that Constraint Programming (CP) is a more flexible and efficient platform to solve the problem than other methods such as mixed integer programming and A* search. In addition to exact search techniques, we also studied local search algorithms to find near optimal solution very quickly. Our empirical analysis on the TPC-H dataset shows that our pruning techniques can reduce the size of the search space by tens of orders of magnitude. Using the TPC-DS dataset, we verify that our local search algorithm is a highly scalable and stable method for quickly finding a near-optimal solution.