Ph.D. Defense: Xia Xiao
October 30 @ 1:00 pm - 3:00 pm EDT
Title: The Speedup Techniques for Deep Neural Networks and its Applications
Ph.D. Candidate: Xia Xiao
Major Advisor: Dr. Sanguthevar Rajasekaran
Associate Advisors: Dr. Jinbo Bi, Dr. Qian Yang
Date/Time: Friday Oct 30, 2020 1:00 pm-3:00 pm
Meeting number: 120 921 1057
Join by phone: +1-415-655-0002 US Toll
Access code: 120 921 1057
Deep neural networks (DNNs) have achieved significant success in many applications, such as computer vision, natural language processing, robotics, and self-driving cars. With the growing demand for more complex real-world applications, increasingly complicated neural networks have been proposed. However, high-capacity models suffer from two major problems: long training times and high inference latency, which make the networks hard to train and infeasible to deploy in time-sensitive applications or on resource-limited devices. In this work, we propose multiple techniques that accelerate training and inference while also improving model performance.
The first technique we study is model parallelization for generative adversarial networks (GANs). Multiple orthogonal generators with shared memory are employed to capture the whole data distribution space. This method not only improves model performance but also alleviates the mode collapse problem that is common in GANs. The second technique we investigate is automatic network pruning. To reduce the floating-point operations (FLOPs) to a proper level without compromising accuracy, we propose a more general and easy-to-use pruning method, which prunes the network by optimizing a set of trainable auxiliary parameters instead of the original weights. Weakly coupled gradient update rules are proposed to keep the updates consistent with the pruning task. The third technique removes redundancy from complicated models according to the needs of the application. We treat chemical reaction prediction as a translation problem and apply a low-capacity neural translation model to it. The fourth technique combines distillation with Differentiable Architecture Search to stabilize and improve the search procedure. Intermediate results as well as the output logits are transferred from the teacher network to the student network. As an application of these speedup techniques, we introduce neural network pruning into materials genomics. We propose attention-based AutoPrune for kernel pruning of a continuous-filtering neural network for molecular property prediction, and achieve better performance with a more compact model.
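The transfer of output logits from a teacher network to a student network, mentioned under the fourth technique, can be illustrated with a minimal sketch. It assumes the widely used temperature-softened KL divergence form of knowledge distillation; the abstract does not state that this is the exact loss used in the thesis. The function names and the temperature value are hypothetical and chosen only for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions.

    Assumption: this is the generic soft-target loss, not necessarily
    the loss used in the thesis. The T^2 factor is the usual rescaling
    that keeps gradient magnitudes comparable across temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 2.0, 0.1]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge, so minimizing it pulls the student's outputs toward the teacher's.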