FreSh: A Lock-Free Data Series Index
Abstract: We present FreSh, a lock-free data series index that exhibits good performance while being robust. FreSh is based on Refresh, a generic approach we have developed for supporting lock-freedom efficiently on top of any locality-aware data series index. We believe Refresh is of independent interest and can be used to obtain well-performing lock-free versions of other locality-aware blocking data structures. To develop FreSh, we first studied in depth the design decisions of current state-of-the-art data series indexes and the principles governing their performance. This led to a theoretical framework that enables the development and analysis of data series indexes in a modular way. The framework allowed us to apply Refresh repeatedly to obtain lock-free versions of the different phases of a family of data series indexes. Experiments with several synthetic and real datasets illustrate that FreSh achieves performance that is as good as that of the state-of-the-art blocking in-memory data series index. This shows that the helping mechanisms of FreSh are lightweight and respect certain principles that are crucial for performance in locality-aware data structures. This paper was published in SRDS 2023.