Cost Models for Complex Queries in ILP-Based Column Selection

Identify cost functions that accurately estimate the execution cost of complex multi-table queries, including operations such as joins, so that integer linear programming (ILP) can be used to select in-memory columns in HTAP databases.

Background

The survey describes an ILP-based approach that selects the columns most worth keeping in memory by minimizing a cost function over query scans, subject to a memory constraint. The approach is effective in simpler settings, but its results are only as good as the cost estimates it is given.

The authors explicitly note that it is unclear how to estimate costs for complex queries that span multiple tables and involve complex operations, which limits the ILP method’s applicability to such workloads.
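To make the scan-based formulation concrete, the following is a minimal sketch of the selection problem described above: choose a 0/1 value per column so that total scan cost is minimized without exceeding a memory budget. Everything in it is an illustrative assumption rather than the survey's or the original authors' formulation: the column sizes, the workload, the per-MB cost constants, and the names `workload_cost` and `select_columns` are hypothetical, and exhaustive enumeration stands in for an ILP solver because the instance is tiny.

```python
from itertools import combinations

# Hypothetical column sizes in MB (illustrative values, not from the survey).
COLUMN_SIZE_MB = {"a": 40, "b": 30, "c": 50}

# Hypothetical workload: each query is (frequency, columns it scans).
WORKLOAD = [
    (10, ["a", "b"]),
    (5, ["b", "c"]),
    (1, ["c"]),
]

# Assumed per-MB scan costs: a row-store scan is taken to be 10x costlier
# than a scan of the in-memory column store.
ROW_COST_PER_MB = 10
COL_COST_PER_MB = 1


def workload_cost(selected):
    """Total scan cost when the columns in `selected` are held in memory."""
    total = 0
    for freq, cols in WORKLOAD:
        for col in cols:
            rate = COL_COST_PER_MB if col in selected else ROW_COST_PER_MB
            total += freq * COLUMN_SIZE_MB[col] * rate
    return total


def select_columns(budget_mb):
    """Try every 0/1 assignment that fits the budget and keep the cheapest.

    This brute-force search stands in for an ILP solver on tiny inputs.
    """
    names = sorted(COLUMN_SIZE_MB)
    best, best_cost = frozenset(), workload_cost(frozenset())
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if sum(COLUMN_SIZE_MB[c] for c in subset) > budget_mb:
                continue  # violates the memory constraint
            cost = workload_cost(frozenset(subset))
            if cost < best_cost:
                best, best_cost = frozenset(subset), cost
    return best, best_cost


if __name__ == "__main__":
    chosen, cost = select_columns(budget_mb=80)
    print(sorted(chosen), cost)
```

With an 80 MB budget, this toy instance selects columns a and b. Note that the cost here is a sum of independent per-column scan terms, which is what keeps the objective linear in the 0/1 variables. The cost of a join generally depends on several columns and tables together, so it does not obviously decompose this way, which is the difficulty the authors raise.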

References

However, it is unclear how the cost functions can estimate the cost of complex queries that involve multiple tables and complex operations.

HTAP Databases: A Survey (2404.15670 - Zhang et al., 2024) in Section 4.2.1 (Data Organization: Primary Row Store with Selected Column Store)