Description
We show that, during inference with Convolutional Neural Networks (CNNs), more than 2X to 8X ineffectual work can be found if, instead of targeting only those weights and activations that are zero, we target different combinations of value stream properties. We demonstrate a practical application with Tactical (TCT), a hardware accelerator which, compared to an equivalent data-parallel accelerator for dense CNNs, improves performance by 15.8X and 8.4X for pruned versions of two popular image classification networks, AlexNet and GoogLeNet, respectively. Further, it is 4.9X and 2.6X more energy efficient, while requiring 48% more area.
Tags