
SFVInt: Simple, Fast and Generic Variable-Length Integer Decoding using Bit Manipulation Instructions

Published 11 Mar 2024 in cs.DB and cs.DC | (2403.06898v4)

Abstract: The ubiquity of variable-length integers in data storage and communication necessitates efficient decoding techniques. In this paper, we present SFVInt, a simple and fast approach to decode the prevalent Little Endian Base-128 (LEB128) varints. Our approach effectively utilizes the Bit Manipulation Instruction Set 2 (BMI2) in modern Intel and AMD processors, achieving significant performance improvement while maintaining simplicity and avoiding overengineering. SFVInt, with its generic design, effectively processes both 32-bit and 64-bit unsigned integers using a unified code template, marking a significant leap forward in varint decoding efficiency. We thoroughly evaluate SFVInt's performance across various datasets and scenarios, demonstrating that it achieves up to a 2x increase in decoding speed when compared to varint decoding methods used in established frameworks like Facebook Folly and Google Protobuf.
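To make the abstract's idea concrete, here is a minimal sketch of unsigned LEB128 decoding written two ways: a byte-at-a-time baseline, and a word-at-a-time variant that gathers the 7 payload bits of each byte with a PEXT-style bit extraction. It is not the paper's implementation. The function names are hypothetical, `pext` is a pure-Python emulation of the BMI2 instruction rather than a hardware call, and the 8-byte fast path with a scalar fallback is an assumption made for illustration.

```python
from typing import Tuple

HIGH_BITS = 0x8080808080808080  # continuation flag of each of 8 bytes
LOW7_BITS = 0x7F7F7F7F7F7F7F7F  # payload bits of each of 8 bytes


def encode_leb128(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative integer")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_leb128(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Baseline decoder: one byte per iteration. Returns (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def pext(source: int, mask: int) -> int:
    """Software emulation of PEXT: pack the bits of `source` selected by
    `mask` into the contiguous low bits of the result."""
    result = 0
    out_bit = 0
    while mask:
        lowest = mask & -mask          # isolate the lowest set mask bit
        if source & lowest:
            result |= 1 << out_bit
        out_bit += 1
        mask &= mask - 1               # clear that mask bit
    return result


def decode_leb128_pext(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Word-at-a-time decoder (illustrative). Handles varints of up to
    8 bytes in one extraction and falls back to the baseline otherwise."""
    chunk = buf[pos:pos + 8]
    word = int.from_bytes(chunk, "little")
    stops = ~word & HIGH_BITS          # set where a byte has no continuation flag
    if stops == 0:
        return decode_leb128(buf, pos)  # varint is longer than 8 bytes
    lowest = stops & -stops            # flag position of the terminating byte
    length = lowest.bit_length() // 8  # number of bytes in this varint
    if length > len(chunk):
        raise ValueError("truncated varint")
    mask = LOW7_BITS & ((lowest << 1) - 1)  # payload bits of those bytes only
    return pext(word, mask), pos + length
```

For example, `encode_leb128(300)` yields the two bytes `b"\xac\x02"`, and both decoders return `(300, 2)` for that input. On hardware with BMI2, the emulated `pext` would correspond to a single instruction, which is where a word-at-a-time decoder of this kind can gain over the per-byte loop.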
