ELSE: Efficient Deep Neural Network Inference through Line-based Sparsity Exploration
Zeqi Zhu*, Alberto Garcia-Ortiz, Luc Waeijen, Egor Bondarev, Arash Pourtaherian, Orlando Moreira
Abstract
Brain-inspired computer architecture facilitates low-power, low-latency deep neural network inference for embedded AI applications. The hardware performance crucially hinges on the quantity of non-zero activations (i.e., events) during inference. Thus, we propose a novel event suppression method, dubbed ELSE, which enhances inference Efficiency via Line-based Sparsity Exploration. Specifically, it exploits spatial correlation between adjacent lines in activation maps to reduce network events. ELSE reduces event-triggered computations by 3.14∼6.49× for object detection and by 2.43∼5.75× for pose estimation across various network architectures compared to conventional processing. Additionally, we show that combining ELSE with other event suppression methods can either significantly enhance computation savings for spatial suppression or reduce state memory footprint by more than 2× for temporal suppression. The latter alleviates the challenge of temporal execution exceeding the resource constraints of real-world embedded platforms. These results highlight ELSE's significant event suppression ability and its capacity to deliver complementary performance enhancements for state-of-the-art methods.
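To illustrate the general idea behind line-based event suppression, the sketch below counts events under conventional processing (every non-zero activation is an event) versus an inter-line delta encoding, where each line of an activation map is expressed as its difference from the previous line. This is a minimal, hypothetical illustration of exploiting spatial correlation between adjacent lines; the activation map, the delta scheme, and all variable names are assumptions for demonstration and do not reproduce ELSE's actual mechanism.

```python
import numpy as np

# Hypothetical post-ReLU activation map (H x W) with a spatially coherent
# region, so adjacent lines are highly correlated (as in natural images).
act = np.zeros((8, 8))
act[2:6, 3:7] = 1.0  # a 4x4 block of active units

# Conventional event count: every non-zero activation triggers computation.
dense_events = int(np.count_nonzero(act))

# Line-based sketch: encode each line as its delta from the previous line.
# Correlated adjacent lines then yield mostly zero deltas, i.e., few events.
deltas = np.empty_like(act)
deltas[0] = act[0]                 # first line transmitted as-is
deltas[1:] = act[1:] - act[:-1]    # remaining lines as inter-line differences
line_events = int(np.count_nonzero(deltas))

# The encoding is lossless: cumulative summation over lines recovers the map.
assert np.allclose(np.cumsum(deltas, axis=0), act)

print(dense_events, line_events)  # only the block's top and bottom edges fire
```

On this toy map the delta encoding halves the event count (8 vs. 16), since only lines where the activation pattern changes produce non-zero deltas; the paper's reported 2.43∼6.49× savings arise from analogous redundancy in real activation maps.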