FreSh: A Lock-Free Data Series Index

Published 17 Oct 2023 in cs.DB and cs.DC | (2310.11602v1)

Abstract: We present FreSh, a lock-free data series index that exhibits good performance while being robust. FreSh is based on Refresh, a generic approach we have developed for supporting lock-freedom efficiently on top of any locality-aware data series index. We believe Refresh is of independent interest and can be used to obtain well-performing lock-free versions of other locality-aware blocking data structures. To develop FreSh, we first studied in depth the design decisions of current state-of-the-art data series indexes and the principles governing their performance. This led to a theoretical framework that enables the development and analysis of data series indexes in a modular way. The framework allowed us to apply Refresh repeatedly to get lock-free versions of the different phases of a family of data series indexes. Experiments with several synthetic and real datasets illustrate that FreSh achieves performance as good as that of the state-of-the-art blocking in-memory data series index. This shows that the helping mechanisms of FreSh are lightweight and respect certain principles that are crucial for performance in locality-aware data structures. This paper was published in SRDS 2023.
