
Big-PERCIVAL: Exploring the Native Use of 64-Bit Posit Arithmetic in Scientific Computing (2305.06946v2)

Published 11 May 2023 in cs.AR

Abstract: The accuracy requirements in many scientific computing workloads result in the use of double-precision floating-point arithmetic in the execution kernels. Nevertheless, emerging real-number representations, such as posit arithmetic, show promise in delivering even higher accuracy in such computations. In this work, we explore the native use of 64-bit posits in a series of numerical benchmarks and compare their timing performance, accuracy and hardware cost to IEEE 754 doubles. In addition, we also study the conjugate gradient method for numerically solving systems of linear equations in real-world applications. For this, we extend the PERCIVAL RISC-V core and the Xposit custom RISC-V extension with posit64 and quire operations. Results show that posit64 can obtain up to 4 orders of magnitude lower mean square error than doubles. This leads to a reduction in the number of iterations required for convergence in some iterative solvers. However, leveraging the quire accumulator register can limit the order of some operations such as matrix multiplications. Furthermore, detailed FPGA and ASIC synthesis results highlight the significant hardware cost of 64-bit posit arithmetic and quire. Despite this, the large accuracy improvements achieved with the same memory bandwidth suggest that posit arithmetic may provide a potential alternative representation for scientific computing.
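The quire mentioned in the abstract is a wide accumulator that holds sums of products without intermediate rounding, so a dot product is rounded only once at the end. The following is a minimal sketch of that idea, not the paper's implementation: it uses Python's exact `Fraction` type as a stand-in for the quire and IEEE 754 doubles in place of posit64, and the function names `naive_dot` and `quire_like_dot` are hypothetical.

```python
from fractions import Fraction


def naive_dot(xs, ys):
    """Dot product that rounds to a double after every multiply and add."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


def quire_like_dot(xs, ys):
    """Dot product accumulated exactly and rounded once at the end.

    Fraction stands in for the quire register (an assumption made for
    illustration; real quire hardware uses a wide fixed-point register).
    """
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        # Fraction(float) is exact, so no rounding happens inside the loop.
        acc += Fraction(x) * Fraction(y)
    return float(acc)  # the single rounding step


xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(naive_dot(xs, ys))       # 0.0: the 1.0 is absorbed by 1e16
print(quire_like_dot(xs, ys))  # 1.0: the exact sum survives cancellation
```

In this sketch every partial product of one dot product flows into a single accumulator before any rounding. That is one plausible reading of why relying on a quire can constrain how loops such as matrix multiplication are ordered.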
