Efficient Algorithms for Top-k Stabbing Queries on Weighted Interval Data (Full Version) (2405.05601v2)
Abstract: Intervals are generated in many applications (e.g., temporal databases), and they are often associated with weights, such as prices. This paper addresses the problem of processing top-k weighted stabbing queries on interval data. Given a set of weighted intervals, a query value, and a result size $k$, the problem is to find the $k$ intervals that are stabbed by the query value (i.e., that contain it) and have the largest weights. Although this problem has practical applications (e.g., purchase, vehicle, and cryptocurrency analysis), it has not been well studied. A state-of-the-art algorithm for this problem takes $O(n\log k)$ time, where $n$ is the number of intervals, so it does not scale to large $n$. We address this inefficiency and propose an algorithm that runs in $O(\sqrt{n}\log n + k)$ time. Furthermore, we propose an $O(\log n + k)$ algorithm that accelerates the search further. Experiments on two large real datasets demonstrate that our algorithms are faster than existing algorithms.
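To make the problem statement concrete, the following is a minimal Python sketch of a straightforward baseline: scan all $n$ intervals and keep the $k$ heaviest stabbed ones in a bounded min-heap, which gives the $O(n\log k)$ bound mentioned above. It is not the paper's proposed algorithm and not necessarily the cited state-of-the-art method. The function name `topk_stabbing_scan`, the `(left, right, weight)` tuple layout, and the choice of closed intervals are assumptions made for illustration.

```python
import heapq
from typing import List, Tuple

# Assumed representation: (left, right, weight), with closed endpoints.
Interval = Tuple[float, float, float]


def topk_stabbing_scan(intervals: List[Interval], q: float, k: int) -> List[Interval]:
    """Return up to k intervals containing q, ordered by decreasing weight.

    Baseline linear scan: O(n log k) time for n intervals.
    """
    heap: List[Tuple[float, int]] = []  # min-heap of (weight, index)
    for i, (left, right, weight) in enumerate(intervals):
        if left <= q <= right:  # the interval is stabbed by q
            if len(heap) < k:
                heapq.heappush(heap, (weight, i))
            elif heap and weight > heap[0][0]:
                # Replace the lightest interval currently kept.
                heapq.heapreplace(heap, (weight, i))
    # Sort the at most k survivors by decreasing weight.
    return [intervals[i] for _, i in sorted(heap, reverse=True)]


if __name__ == "__main__":
    data = [(1, 5, 10), (2, 8, 30), (6, 9, 20), (0, 3, 5), (4, 4, 25)]
    print(topk_stabbing_scan(data, q=4, k=2))
```

Every query in this sketch touches all $n$ intervals regardless of how few are stabbed, which is the scalability issue the abstract refers to.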