Abstract
High-performance neural network accelerator architectures rely on large external memory bandwidth and/or sparse computation paradigms, both of which scale down unfavorably. Current state-of-the-art architectures also include on-chip SRAMs of often more than 150 kB, establishing a lower bound on silicon area from memory alone. This article presents an architecture that exploits programmable dataflow in combination with sparsity to make more efficient use of small on-chip memories. Its control logic supports an enlarged map space through uneven mappings, which a cost-model-driven compiler searches for points that prioritize energy consumption and memory access over throughput. The problem of supporting sparse processing under flexible dataflows is circumvented by an encoding scheme that provides sparsity metadata while still allowing random read and write access. Altogether, the system reduces external memory accesses while requiring only 51 kB of on-chip SRAM and a small silicon area (0.5 mm²). The design achieves an average energy efficiency of 4.4 TOPS/W and 9.7 inferences/s on a sparse AlexNet workload.
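The abstract does not specify the encoding scheme itself. As a hedged illustration of the property it claims, the sketch below shows one way a bitmap-based block format can carry sparsity metadata while still permitting O(1) random reads and in-place writes, unlike stream-oriented formats such as run-length encodings that require sequential decoding. All names (SparseBlock, sparse_read, sparse_write) and the block size are assumptions for illustration, not the paper's design.

#include <stdint.h>
#include <stdbool.h>

/* Illustrative sketch only; the paper's actual encoding is not described
 * in the abstract. A bitmask marks nonzero positions, and nonzero values
 * are packed densely in index order. A popcount over the mask bits below
 * an index yields that element's packed offset, enabling random access. */

#define BLOCK 32                 /* values per encoded block (assumed) */

typedef struct {
    uint32_t mask;               /* bit i set => element i is nonzero */
    int8_t   data[BLOCK];        /* nonzero values, packed in index order */
} SparseBlock;

/* Number of nonzeros strictly before position idx (its packed offset). */
static int packed_offset(uint32_t mask, int idx) {
    uint32_t below = (idx == 0) ? 0u : (mask & (0xFFFFFFFFu >> (32 - idx)));
    return __builtin_popcount(below);   /* GCC/Clang builtin */
}

/* Random read: constant time via popcount, no sequential decode. */
int8_t sparse_read(const SparseBlock *b, int idx) {
    if (!(b->mask & (1u << idx))) return 0;        /* implicit zero */
    return b->data[packed_offset(b->mask, idx)];
}

/* Random write: insert, overwrite, or remove, shifting the packed tail. */
void sparse_write(SparseBlock *b, int idx, int8_t v) {
    int  off     = packed_offset(b->mask, idx);
    int  n       = __builtin_popcount(b->mask);    /* current nonzero count */
    bool present = (b->mask >> idx) & 1u;

    if (v != 0 && !present) {                      /* insert new nonzero */
        for (int i = n; i > off; i--) b->data[i] = b->data[i - 1];
        b->data[off] = v;
        b->mask |= 1u << idx;
    } else if (v != 0 && present) {                /* overwrite in place */
        b->data[off] = v;
    } else if (v == 0 && present) {                /* remove, restore zero */
        for (int i = off; i < n - 1; i++) b->data[i] = b->data[i + 1];
        b->mask &= ~(1u << idx);
    }
}

A format of this kind keeps the metadata overhead at one bit per element while leaving every element individually addressable, which is the combination the abstract identifies as necessary for sparse processing under flexible (rather than fixed) dataflows.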
Keywords
Bandwidth, Convolution, DNN accelerator, Encoding, Energy efficiency, Logic, Memory management, System-on-chip, edge AI, flexible dataflow, map space exploration, sparse processing