Data Freshness with Low Replication Latency in Cloud-Native HTAP

Establish techniques for cloud-native HTAP databases with disaggregated compute and storage that guarantee high data freshness at low replication latency in the compute layer, even when storage-layer logs have not yet been replayed.

Background

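Cloud-native HTAP systems disaggregate compute and storage, often treating logs as first-class data. When compute nodes depend on replaying storage-layer logs, they can serve only the updates that have already been replayed, so data freshness drops whenever replay lags behind the latest committed log records.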

The survey authors (Zhang et al., 2024) explicitly identify ensuring high data freshness with low replication latency in the compute layer as an open problem in cloud-native HTAP designs.
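
To make the freshness gap concrete, the following is a minimal sketch of a compute-layer replica that becomes stale while storage-layer log records remain unreplayed. It is illustrative only and is not taken from the survey: the names `LogRecord`, `ComputeReplica`, and `freshness_lag` are hypothetical, and measuring lag as "records behind" and "seconds behind the latest commit" is an assumed definition chosen for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    lsn: int          # log sequence number assigned at commit (assumed monotonic)
    commit_ts: float  # commit time in seconds on a hypothetical shared clock
    key: str
    value: int


@dataclass
class ComputeReplica:
    """Hypothetical compute-layer replica that replays storage-layer logs."""
    replayed_lsn: int = 0
    replayed_ts: float = 0.0
    data: dict = field(default_factory=dict)

    def replay(self, log, up_to_lsn):
        """Apply, in order, every record with replayed_lsn < lsn <= up_to_lsn."""
        for rec in log:
            if self.replayed_lsn < rec.lsn <= up_to_lsn:
                self.data[rec.key] = rec.value
                self.replayed_lsn = rec.lsn
                self.replayed_ts = rec.commit_ts


def freshness_lag(log, replica):
    """Return (records behind, seconds behind) relative to the newest log record."""
    if not log:
        return 0, 0.0
    latest = log[-1]
    records_behind = latest.lsn - replica.replayed_lsn
    seconds_behind = max(0.0, latest.commit_ts - replica.replayed_ts)
    return records_behind, seconds_behind


# Storage-layer log: key "a" is overwritten by the third record.
log = [
    LogRecord(lsn=1, commit_ts=0.0, key="a", value=10),
    LogRecord(lsn=2, commit_ts=1.0, key="b", value=20),
    LogRecord(lsn=3, commit_ts=3.0, key="a", value=30),
]

replica = ComputeReplica()
replica.replay(log, up_to_lsn=2)      # replay has not yet reached LSN 3
stale_value = replica.data["a"]       # still the old value of "a"
lag_before = freshness_lag(log, replica)

replica.replay(log, up_to_lsn=3)      # replay catches up
fresh_value = replica.data["a"]
lag_after = freshness_lag(log, replica)
```

In this toy model an analytical read issued before the final replay sees the old value of `a`. The open problem concerns closing that gap without paying a high replication latency in the compute layer.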

References

First, since the compute and storage are disaggregated, it is challenging to deliver a high data freshness if the log in the storage layer has not been replayed. Thus, how to guarantee the data freshness with a low replication latency for the compute layer is an open problem.

HTAP Databases: A Survey (2404.15670 - Zhang et al., 2024), Section 6 (Open Problems and Challenges)