April 25, 2019
Speaker: Caiwen Ding, Research Assistant, Northeastern University
Date: April 25, 2019
Location: HBL 1947 Room
Structured Representations in Deep Neural Network Systems: from Algorithms to Hardware Implementations
Deep learning systems have achieved unprecedented progress in many fields, such as computer vision, natural language processing, game playing, unmanned driving, and aerial systems. However, rapidly expanding model sizes place significant demands on weight storage, computation, and energy, for both inference and training, and on both high-performance computing systems and low-power embedded systems and IoT applications. To overcome these limitations, we propose a holistic framework that incorporates block-circulant matrices into deep learning systems and achieves: (i) simultaneous reduction in weight storage and computational complexity, (ii) simultaneous speedup of training and inference, and (iii) a general and fundamental formulation that can be adapted to different neural network types, sizes, and scales. We further use the alternating direction method of multipliers (ADMM) as the optimization method for training block-circulant matrix-based DNNs, to reduce training effort and achieve better accuracy.
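The storage and computation savings of a circulant block can be illustrated with a minimal sketch. It is not the speaker's implementation; the function names `circulant` and `circulant_matvec`, the block size of 8, and the convention that a block is defined by its first column are assumptions made for illustration. A k-by-k circulant block is fully determined by k values instead of k*k, and its matrix-vector product is a circular convolution, which the FFT computes in O(k log k) rather than O(k^2).

```python
import numpy as np

def circulant(w):
    """Dense circulant matrix with first column w: C[i, j] = w[(i - j) % n]."""
    n = len(w)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return w[idx]

def circulant_matvec(w, x):
    """Compute circulant(w) @ x via the convolution theorem (no dense matrix)."""
    return np.fft.irfft(np.fft.rfft(w) * np.fft.rfft(x), n=len(w))

# Illustrative data only (block size 8 is an arbitrary choice).
rng = np.random.default_rng(0)
w = rng.standard_normal(8)   # the 8 stored weights
x = rng.standard_normal(8)   # input vector

dense = circulant(w) @ x         # O(n^2) reference, stores n*n weights
fast = circulant_matvec(w, x)    # O(n log n), stores n weights
print(np.allclose(dense, fast))
```

The dense matrix is built here only to check the FFT path; a compressed layer would store `w` alone.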
Beyond the algorithm-level results, the framework rests on a solid theoretical foundation: we prove that the approach converges to the same “effectiveness” as deep learning without compression. For hardware implementations, we (i) develop a high-level synthesis framework that translates C to Verilog, and (ii) propose a resource-aware, systematic quantization framework to make maximal use of hardware resources. The hardware implementations are based on the key principle of FFT-IFFT decoupling and on reconfigurable basic computing modules that support different layers of DNNs, different DNN models and types, and different computing platforms (smartphones, FPGAs, ASICs). Our FPGA-based implementations of deep learning systems and LSTM-based recurrent neural networks achieve up to 36X improvement in energy efficiency compared with the state of the art.
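One way to read FFT-IFFT decoupling, sketched here in software under stated assumptions, is that the forward FFTs of the input blocks are computed once, the per-block products are accumulated in the frequency domain, and a single inverse FFT is applied per output block. The weight layout `(p, q, k)`, the helper names, and the sizes below are hypothetical, and the sketch says nothing about how the actual hardware modules are organized.

```python
import numpy as np

def circulant(w):
    """Dense circulant matrix with first column w (used only as a reference)."""
    n = len(w)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return w[idx]

def block_circulant_matvec(W, x):
    """Multiply a block-circulant matrix by x.

    W has shape (p, q, k): W[i, j] is the defining vector (first column)
    of the k-by-k circulant block at block-row i, block-column j.
    x has length q * k; the result has length p * k.
    """
    p, q, k = W.shape
    X = np.fft.rfft(x.reshape(q, k), axis=1)   # q forward FFTs, each done once
    Wf = np.fft.rfft(W, axis=2)                # weight spectra; could be precomputed
    Yf = np.einsum("ijf,jf->if", Wf, X)        # accumulate in the frequency domain
    return np.fft.irfft(Yf, n=k, axis=1).reshape(p * k)  # only p inverse FFTs

# Illustrative sizes: a 12-by-8 weight matrix built from 3-by-2 blocks of size 4.
rng = np.random.default_rng(1)
p, q, k = 3, 2, 4
W = rng.standard_normal((p, q, k))
x = rng.standard_normal(q * k)

dense = np.block([[circulant(W[i, j]) for j in range(q)] for i in range(p)])
y_ref = dense @ x
y_fft = block_circulant_matvec(W, x)
print(np.allclose(y_ref, y_fft))
```

Because the inverse FFT is linear, summing the products before transforming back gives the same result as transforming each product separately, while using q forward and p inverse transforms instead of p*q of each.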