Building an OceanBase-based Distributed Nearly Real-time Analytical Processing Database System
Abstract: The growing demand for database systems capable of efficiently managing massive datasets while delivering real-time transaction processing and advanced analytical capabilities has become critical in modern data infrastructure. While traditional OLAP systems often fail to meet these dual requirements, emerging real-time analytical processing systems still face persistent challenges, such as excessive data redundancy, complex cross-system synchronization, and suboptimal temporal efficiency. This paper introduces OceanBase Mercury as an innovative OLAP system designed for petabyte-scale data. The system features a distributed, multi-tenant architecture that ensures essential enterprise-grade requirements, including continuous availability and elastic scalability. Our technical contributions include three key components: (1) an adaptive columnar storage format with hybrid data layout optimization, (2) a differential refresh mechanism for materialized views with temporal consistency guarantees, and (3) a polymorphic vectorization engine supporting three distinct data formats. Empirical evaluations under real-world workloads demonstrate that OceanBase Mercury outperforms specialized OLAP engines by 1.3X to 3.1X speedup in query latency while maintaining sub-second latency, positioning it as a groundbreaking AP solution that effectively balances analytical depth with operational agility in big data environments.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.