


Doctoral Dissertation Proposal: Shaoyi Huang

February 19 @ 9:30 am - 10:30 am EST

Title: Towards High Performance Model Inference and Training: From Algorithm to Hardware

Ph.D. Candidate: Shaoyi Huang

Major Advisor: Dr. Caiwen Ding

Co-Major Advisor: Dr. Omer Khan

Associate Advisor: Dr. Dongkuan Xu

Committee Members: Dr. Jinbo Bi, Dr. Yuan Hong

Date/Time: Monday, February 19th, 2024, 9:30 am

Location: HBL Class of 1947 Room

Meeting link: https://uconn-cmr.webex.com/meet/shh20007


In recent years, significant advances in artificial intelligence have been driven by the development of Deep Neural Networks (DNNs) and Transformer-based models, including BERT, GPT-3, and other Large Language Models (LLMs). These technologies have catalyzed innovations in fields such as autonomous driving, recommendation systems, and chatbot applications. The models are increasingly designed with deeper, more complex structures and require ever larger computational resources. As computational demands escalate, model sparsification has emerged as a promising method to reduce model size and computational load during execution. Yet even on modern high-performance computing platforms, particularly advanced GPUs, achieving end-to-end DNN runtime speedup through model sparsification remains an attractive but difficult goal, because exploiting sparsity often requires changes to matrix storage formats and kernel configurations.
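To make the idea of model sparsification concrete, here is a minimal, hypothetical sketch of magnitude-based unstructured pruning in NumPy. This is a generic illustration of the technique the abstract refers to, not the candidate's specific method; the function name and API are assumptions for this example.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a copy of `weights` with the smallest-magnitude entries
    zeroed out, so that roughly `sparsity` fraction of entries are zero."""
    k = int(weights.size * sparsity)  # number of entries to prune
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask
```

Note that zeros produced this way are scattered irregularly through the matrix, which is exactly why the speedup is hard to realize in practice: dense GPU kernels gain nothing from unstructured zeros, so the pruned matrix must be converted to a sparse storage format and paired with kernels tuned for its sparsity pattern.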

In this proposal, I will present my work on model inference and training acceleration at both the algorithm and hardware levels. It focuses on three innovative aspects: (1) an advanced sparse progressive pruning method, which shows for the first time that reducing the risk of overfitting can improve the effectiveness of pruning on language models; (2) a novel self-attention architecture with attention-specific primitives and an attention-aware pruning design for inference acceleration of Transformer-based models; (3) recent work on sparse training via weight-importance exploitation and weight-coverage exploration, which unlocks the sparsity potential and enables a range of CNN and GNN models to achieve extremely high sparsity.
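Progressive pruning, as mentioned in aspect (1), typically ramps the sparsity target up gradually over training rather than pruning all at once. A commonly used schedule is the cubic ramp of Zhu and Gupta; the sketch below illustrates that general idea and is an assumption for exposition, not necessarily the schedule used in the proposed method.

```python
def cubic_sparsity_schedule(step: int, total_steps: int,
                            final_sparsity: float,
                            initial_sparsity: float = 0.0) -> float:
    """Sparsity target at a given training step: ramps from
    `initial_sparsity` to `final_sparsity` along a cubic curve,
    pruning aggressively early and gently near the end."""
    t = min(max(step / total_steps, 0.0), 1.0)  # normalized progress in [0, 1]
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - t) ** 3
```

Pruning a little at each step gives the remaining weights time to recover from each removal, which is one way progressive schedules can act as a regularizer against overfitting.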




HBL Class of 1947 Conference Room
UConn Library, 369 Fairfield Way, Unit 1005
Storrs, CT 06269 United States
(860) 486-2518
