AxOMaP: Designing FPGA-based Approximate Arithmetic Operators using Mathematical Programming (2309.13445v1)
Abstract: With the increasing application of ML algorithms in embedded systems, there is a growing need for low-cost computer arithmetic in these resource-constrained systems. As a result, emerging models of computation that leverage the inherent error resilience of such algorithms, such as approximate and stochastic computing, are being actively explored for implementing ML inference on resource-constrained systems. Approximate computing (AxC) aims to provide disproportionate gains in the power, performance, and area (PPA) of an application by allowing some reduction in its behavioral accuracy (BEHAV). Using approximate operators (AxOs) for computer arithmetic is one of the more prevalent methods of implementing AxC. AxOs provide additional scope for finer-grained optimization than precision scaling of computer arithmetic alone. Designing platform-specific and cost-efficient approximate operators is therefore an important research goal. Recently, multiple works have reported using AI/ML-based approaches for synthesizing novel FPGA-based AxOs. However, most such works limit the use of AI/ML to designing ML-based surrogate functions used during iterative optimization processes. To address this limitation, we propose a novel data analysis-driven, mathematical programming-based approach to synthesizing approximate operators for FPGAs. Specifically, we formulate mixed integer quadratically constrained programs based on the results of correlation analysis of the characterization data and use the solutions to enable a more directed search approach for evolutionary optimization algorithms. Compared to traditional evolutionary algorithm-based optimization, we report up to 21% improvement in the hypervolume, for joint optimization of PPA and BEHAV, in the design of signed 8-bit multipliers.
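The hypervolume figure quoted in the abstract measures the region of objective space dominated by a Pareto front, relative to a reference point. The following is a minimal sketch of that indicator for two minimized objectives, not the paper's evaluation code. The function name `hypervolume_2d`, the reference point, and the sample fronts are all hypothetical, and the two axes merely stand in for a PPA cost and a BEHAV error.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Only points strictly better than the reference in both objectives contribute.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_y = ref[1]  # lowest second-objective value seen so far
    for x, y in pts:  # sweep in increasing order of the first objective
        if y < best_y:  # non-dominated point: add the new horizontal slab
            hv += (ref[0] - x) * (best_y - y)
            best_y = y
    return hv


# Hypothetical fronts: (PPA cost, BEHAV error), both normalized and minimized.
reference = (4.0, 4.0)
baseline_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
directed_front = [(1.0, 2.5), (2.0, 1.5), (3.0, 1.0)]

hv_base = hypervolume_2d(baseline_front, reference)
hv_directed = hypervolume_2d(directed_front, reference)
improvement_pct = 100.0 * (hv_directed - hv_base) / hv_base
```

A larger hypervolume indicates a front that is closer to the ideal point, more spread out, or both, so a relative improvement such as `improvement_pct` compares two search methods using a single scalar. With more than two objectives, an existing implementation is usually preferable to a hand-written sweep.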