
AxOMaP: Designing FPGA-based Approximate Arithmetic Operators using Mathematical Programming (2309.13445v1)

Published 23 Sep 2023 in cs.AR, cs.AI, and eess.SP

Abstract: With the increasing application of ML algorithms in embedded systems, there is a rising need to design low-cost computer arithmetic for these resource-constrained systems. As a result, emerging models of computation, such as approximate and stochastic computing, that leverage the inherent error resilience of such algorithms are being actively explored for implementing ML inference on resource-constrained systems. Approximate computing (AxC) aims to provide disproportionate gains in the power, performance, and area (PPA) of an application by allowing some level of reduction in its behavioral accuracy (BEHAV). Using approximate operators (AxOs) for computer arithmetic is one of the more prevalent methods of implementing AxC. AxOs provide additional scope for finer-grained optimization compared to precision scaling of computer arithmetic alone. Designing platform-specific and cost-efficient approximate operators is therefore an important research goal. Recently, multiple works have reported using AI/ML-based approaches for synthesizing novel FPGA-based AxOs. However, most such works limit the use of AI/ML to designing ML-based surrogate functions used during iterative optimization processes. To this end, we propose a novel data analysis-driven, mathematical programming-based approach to synthesizing approximate operators for FPGAs. Specifically, we formulate mixed integer quadratically constrained programs based on the results of correlation analysis of the characterization data and use the solutions to enable a more directed search approach for evolutionary optimization algorithms. Compared to traditional evolutionary algorithm-based optimization, we report up to 21% improvement in the hypervolume, for joint optimization of PPA and BEHAV, in the design of signed 8-bit multipliers.
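
The abstract reports its gain as an improvement in hypervolume, so a small illustration of that metric may help. The sketch below assumes a simplified setting that is not taken from the paper: only two objectives (for example, a PPA cost and a BEHAV error), both minimized, and a user-chosen reference point. The function name `hypervolume_2d` and all numeric values are hypothetical and do not come from the paper's data.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref`.

    Assumes two objectives, both minimized. `points` is an iterable of
    (cost, error) pairs and `ref` is a reference point that is worse than
    every point of interest in both objectives.
    """
    # Ignore points that do not improve on the reference in both objectives.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_err = ref[1]
    # Sweep in order of increasing cost; each point that lowers the best
    # error seen so far contributes one new rectangular slab.
    for cost, err in pts:
        if err < best_err:
            area += (ref[0] - cost) * (best_err - err)
            best_err = err
    return area


# Hypothetical fronts (made-up numbers, for illustration only).
reference = (4.0, 4.0)
baseline_front = [(1.0, 3.0), (3.0, 1.0)]
improved_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]

hv_base = hypervolume_2d(baseline_front, reference)
hv_new = hypervolume_2d(improved_front, reference)
relative_gain = (hv_new - hv_base) / hv_base
print(f"baseline={hv_base}, improved={hv_new}, gain={relative_gain:.0%}")
```

A larger hypervolume means the front dominates more of the objective space relative to the reference point, which is why a relative increase can serve as a single-number comparison between two optimization methods. The paper's joint PPA and BEHAV optimization may involve more than two objectives, where dedicated libraries are typically used instead of this sweep.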
