
Ph.D. Proposal: Zigeng Wang

May 5 @ 9:30 am - 10:30 am EDT

 

Title: Efficient Techniques to Process Big Data

Ph.D. Candidate: Zigeng Wang

Major Advisor: Dr. Sanguthevar Rajasekaran

Associate Advisors: Dr. Caiwen Ding, Dr. Qian Yang

Review Committee Members: Dr. Reda Ammar, Dr. Nalini Ravishanker

Date/Time: Wednesday, May 5th, 2021, 9:30 AM - 10:30 AM EDT

Location: Online (WebEx)

Meeting link: https://uconn-cmr.webex.com/uconn-cmr/j.php?MTID=m4ed69974ecf4ef1e43404da35dcd7e45

Meeting number: 120 923 9596

Password: kvTDFmqQ522

Join by phone: +1-415-655-0002 US Toll

Access code: 120 923 9596

 

Abstract:

In the current era, we see an explosion in data volume, diversity, and dimensionality. Processing big data demands enormous computational resources and storage space, and it brings both difficulties and opportunities to computing. This proposal offers several techniques for efficiently processing and learning from big data:

The first technique we investigate is feature selection. Feature selection is a crucial problem in efficient machine learning, and it also greatly contributes to the explainability of machine-driven decisions. Embedded methods, such as decision trees and LASSO, can select features during training. However, these embedded approaches can be applied only to a small subset of machine learning models. Wrapper-based methods can select features independently of the underlying machine learning model, but they often suffer from a high computational cost. To enhance their efficiency, we have designed a wrapper-based randomized feature selection algorithm.
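To illustrate the wrapper idea in general (this is a generic sketch, not the proposal's algorithm), a minimal randomized wrapper can sample random feature subsets, score each with a black-box model-evaluation callback, and keep the best subset found:

```python
import random

def randomized_wrapper_select(features, score, n_iter=500, subset_size=2, seed=0):
    """Generic randomized wrapper feature selection (illustrative sketch).

    `score(subset)` is any model-evaluation callback (e.g. cross-validated
    accuracy); the wrapper treats the learning model as a black box.
    """
    rng = random.Random(seed)
    best_subset, best_score = None, float("-inf")
    for _ in range(n_iter):
        subset = tuple(sorted(rng.sample(features, subset_size)))
        s = score(subset)
        if s > best_score:
            best_subset, best_score = subset, s
    return best_subset, best_score

# Toy example: the "model" rewards subsets overlapping features {0, 4}.
target = {0, 4}
best, best_s = randomized_wrapper_select(
    list(range(10)), lambda sub: len(target & set(sub)))
```

Because the scoring callback is the only point of contact with the model, the same wrapper works with any learner, which is the model-independence the paragraph above refers to; the cost is the repeated model evaluations that a randomized search strategy tries to spend efficiently.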

The second technique we explore is efficient time series spectrum analysis. Higher order spectra (HOS) are a powerful tool in nonlinear time series analysis, and they have been extensively used as feature representations in data mining, communications, and cosmology. However, HOS estimation suffers from a high computational cost and memory consumption, restricting its use in resource-limited and time-sensitive applications. We present a package of generic sequential and parallel algorithms for computation- and memory-efficient HOS estimation that can be employed on any parallel machine or platform.
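For concreteness, the simplest higher order spectrum is the bispectrum, and a textbook direct (FFT-based) estimator averages triple products of segment DFTs. The sketch below (assuming numpy; it is not the package's optimized algorithm) shows why cost grows quickly: even this third-order case needs a 2-D grid of frequency pairs per segment:

```python
import numpy as np

def bispectrum_estimate(x, nfft=64, seg_len=64):
    """Direct bispectrum estimate by segment averaging (illustrative sketch).

    Averages X[f1] * X[f2] * conj(X[f1 + f2]) over segment DFTs X,
    for frequency pairs (f1, f2) in the lower half of the DFT grid.
    """
    n_seg = len(x) // seg_len
    half = nfft // 2
    B = np.zeros((half, half), dtype=complex)
    for k in range(n_seg):
        seg = x[k * seg_len:(k + 1) * seg_len]
        X = np.fft.fft(seg - np.mean(seg), nfft)
        for f1 in range(half):
            for f2 in range(half):
                B[f1, f2] += X[f1] * X[f2] * np.conj(X[(f1 + f2) % nfft])
    return B / max(n_seg, 1)
```

Each segment costs O(nfft^2) on top of the FFT, and the accumulator is a dense 2-D array; higher orders add a dimension per order, which is the computational and memory pressure the paragraph above describes.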

Besides the theoretical research above, collaborative work on machine learning applications will be briefly covered in the proposal presentation. These applications span materials science, medical science, and metagenomics.

Future work will include efficient learning of neural network substructures. We propose to design a differentiable sub-network search algorithm with dynamic layer-wise pruning strength adjustment, leveraging the sensitivity of Lagrangian multipliers.
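As a loose illustration of constraint-driven pruning-strength adjustment (a hypothetical numpy sketch of our own, not the proposed differentiable algorithm), a single dual-style multiplier, here doubling as a magnitude threshold, can be updated until global pruning sparsity meets a budget, analogous to how a Lagrangian multiplier enforces a constraint:

```python
import numpy as np

def prune_to_budget(layers, target_sparsity, lr=0.5, steps=200):
    """Hypothetical sketch: dual-ascent-style adjustment of a magnitude
    threshold so overall pruning sparsity meets a target budget."""
    total = sum(w.size for w in layers)
    thresh = 0.0
    for _ in range(steps):
        pruned = sum(int(np.sum(np.abs(w) < thresh)) for w in layers)
        sparsity = pruned / total
        # Multiplier-style update: raise the threshold while under budget,
        # lower it while over budget.
        thresh = max(0.0, thresh + lr * (target_sparsity - sparsity))
    masks = [np.abs(w) >= thresh for w in layers]
    return masks, thresh

rng = np.random.default_rng(0)
layers = [rng.standard_normal((100, 100)) for _ in range(3)]
masks, thresh = prune_to_budget(layers, target_sparsity=0.5)
```

In the proposed research, the adjustment would instead be layer-wise and differentiable, so that pruning strength can be tuned jointly with training rather than by this simple fixed-point iteration.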
