A framework for designing power-efficient inference accelerators in tree-based learning applications.

Bibliographic Details
Title:	A framework for designing power-efficient inference accelerators in tree-based learning applications.
Authors:	Abreu, Brunno¹ (AUTHOR) baabreu@inf.ufrgs.br, Grellert, Mateus² (AUTHOR), Bampi, Sergio¹ (AUTHOR)
Source:	Engineering Applications of Artificial Intelligence. Mar2022, Vol. 109, pN.PAG-N.PAG. 1p.
Subjects:	Random forest algorithms, Decision trees, Hawthorns, Machine learning, Feature selection
Abstract:	Machine Learning techniques (ML) are being widely adopted in embedded devices due to their efficiency and flexibility. However, the strict power limitations in such devices, combined with the variable resource requirements of ML models, require further understanding of how model complexity affects power and performance. This paper proposes a framework that facilitates the design space exploration of dedicated decision trees (DT) and random forests (RF) accelerators by enabling a joint assessment of power dissipation and prediction accuracy. The proposed framework translates tree-based structures to hardware description languages (HDL). The HDL modules are submitted to logic and physically-aware hardware synthesis flows, allowing a detailed power-performance analysis of VLSI DTs and RFs. Using four data sets of embedded applications as case studies, we found that quantizing the input features leads to accuracy gains of up to 6.3% compared with the precise versions. We also show that using shallower trees may lead to small prediction loss with significant reductions in power, which is favorable for power-constrained applications. Our translator achieves better results in terms of energy/inference w.r.t. prior related works under comparison, one of which employed standard methods for hardware translation such as High-Level Synthesis. The proposed solution presents a power reduction of 10 times or more for the same inference throughput reported in prior works. [ABSTRACT FROM AUTHOR]
	Copyright of Engineering Applications of Artificial Intelligence is the property of Pergamon Press - An Imprint of Elsevier Science and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database:	Engineering Source

ISSN:	09521976
DOI:	10.1016/j.engappai.2021.104638
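
Illustrative sketch (not part of the record): the abstract describes translating tree-based structures into HDL and quantizing the input features. The Python below is a minimal sketch of that general idea, not the authors' translator. It assumes a uniform unsigned quantizer over a known feature range and emits a purely combinational Verilog expression; the names Leaf, Split, quantize, to_expr, to_verilog, and predict are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    label: int  # predicted class index


@dataclass
class Split:
    feature: int      # index of the input feature compared at this node
    threshold: float  # take the left branch when feature <= threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def quantize(value: float, bits: int, lo: float, hi: float) -> int:
    """Assumed quantizer: clip to [lo, hi], then floor to a `bits`-bit code."""
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    return int((clipped - lo) / (hi - lo) * levels)


def to_expr(node: Node, bits: int, class_bits: int, lo: float, hi: float) -> str:
    """Render the tree as nested Verilog conditional (?:) expressions."""
    if isinstance(node, Leaf):
        return f"{class_bits}'d{node.label}"
    thr = quantize(node.threshold, bits, lo, hi)
    left = to_expr(node.left, bits, class_bits, lo, hi)
    right = to_expr(node.right, bits, class_bits, lo, hi)
    return f"((f{node.feature} <= {bits}'d{thr}) ? {left} : {right})"


def to_verilog(root: Node, n_features: int, bits: int, class_bits: int,
               lo: float, hi: float, name: str = "dt") -> str:
    """Wrap the expression in a combinational module with one port per feature."""
    ports = "\n".join(
        f"    input  wire [{bits - 1}:0] f{i}," for i in range(n_features)
    )
    expr = to_expr(root, bits, class_bits, lo, hi)
    return (
        f"module {name} (\n{ports}\n"
        f"    output wire [{class_bits - 1}:0] y\n);\n"
        f"    assign y = {expr};\n"
        f"endmodule\n"
    )


def predict(node: Node, codes: List[int], bits: int, lo: float, hi: float) -> int:
    """Software reference that mirrors the generated hardware on quantized codes."""
    while isinstance(node, Split):
        thr = quantize(node.threshold, bits, lo, hi)
        node = node.left if codes[node.feature] <= thr else node.right
    return node.label


# Example: a depth-2 tree over two features, quantized to 4 bits in [0, 1].
tree = Split(0, 0.5, Leaf(0), Split(1, 0.25, Leaf(1), Leaf(2)))
verilog_source = to_verilog(tree, n_features=2, bits=4, class_bits=2, lo=0.0, hi=1.0)
```

Because the thresholds are floored to integer codes, the quantized tree can classify some inputs differently from the floating-point tree. Sweeping `bits` and the tree depth while measuring accuracy in software and power after synthesis is the kind of joint exploration the abstract refers to.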