Ph.D. Proposal: Zigeng Wang
May 5 @ 9:30 am - 10:30 am EDT
Title: Efficient Techniques to Process Big Data
Ph.D. Candidate: Zigeng Wang
Major Advisor: Dr. Sanguthevar Rajasekaran
Associate Advisors: Dr. Caiwen Ding, Dr. Qian Yang
Review Committee Members: Dr. Reda Ammar, Dr. Nalini Ravishanker
Date/Time: Wednesday, May 5th, 2021, 9:30AM-10:30AM
Meeting number: 120 923 9596
Join by phone: +1-415-655-0002 US Toll
Access code: 120 923 9596
The current era has seen an explosion in data volume, diversity, and dimensionality. Processing big data demands enormous computational resources and storage space, and it poses both challenges and opportunities for computing. This proposal offers several techniques for efficiently processing and learning from big data:
The first technique we investigate is feature selection. Feature selection is a crucial problem in efficient machine learning, and it also greatly contributes to the explainability of machine-driven decisions. Methods like decision trees and LASSO can select features during training, but these embedded approaches apply only to a small subset of machine learning models. Wrapper-based methods can select features independently of the machine learning model, but they often suffer from high computational cost. To enhance their efficiency, we have designed a wrapper-based randomized feature selection algorithm.
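The proposal does not detail the algorithm itself, but the general idea of a wrapper with randomized search can be illustrated with a minimal sketch: repeatedly sample candidate feature subsets, score each by training a simple model (here, least-squares regression, chosen only for self-containment), and keep the best. The function name and parameters below are illustrative, not from the proposal.

```python
import numpy as np

def wrapper_random_search(X, y, k, n_trials=500, seed=0):
    """Randomized wrapper feature selection (illustrative sketch).

    Samples k-feature subsets at random, fits a least-squares model
    on a training split for each, and keeps the subset with the
    lowest validation error.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    split = n // 2
    Xtr, ytr = X[:split], y[:split]
    Xva, yva = X[split:], y[split:]
    best_subset, best_err = None, np.inf
    for _ in range(n_trials):
        subset = rng.choice(d, size=k, replace=False)
        # Train the wrapped model on the candidate subset only.
        w, *_ = np.linalg.lstsq(Xtr[:, subset], ytr, rcond=None)
        err = np.mean((Xva[:, subset] @ w - yva) ** 2)
        if err < best_err:
            best_subset, best_err = np.sort(subset), err
    return best_subset, best_err

# Synthetic data in which only features 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.01 * rng.normal(size=200)
subset, err = wrapper_random_search(X, y, k=2)
print(subset, err)  # typically recovers the informative features [0 3]
```

Because the wrapper only queries the model as a black box, the same search loop works with any learner, which is the model-agnosticity the abstract contrasts against embedded methods; the price is the repeated training cost that the proposed algorithm aims to reduce.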
The second technique we explore is efficient time series spectrum analysis. Higher order spectra (HOS) are a powerful tool in nonlinear time series analysis, and they have been used extensively as feature representations in the data mining, communications, and cosmology domains. However, HOS estimation suffers from high computational cost and memory consumption, restricting its use in resource-limited and time-sensitive applications. We present a package of generic sequential and parallel algorithms for computation- and memory-efficient HOS estimation that can be employed on any parallel machine or platform.
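The proposal's package is not shown here, but the standard direct (FFT-based) estimator of the bispectrum, the simplest higher order spectrum, conveys what is being computed: the third-order quantity B(f1, f2) = E[F(f1) F(f2) F*(f1 + f2)], averaged over segments. This sketch is a textbook baseline, not the proposed efficient algorithms.

```python
import numpy as np

def bispectrum_direct(x, nfft=64):
    """Direct bispectrum estimate, averaged over non-overlapping segments."""
    segs = len(x) // nfft
    B = np.zeros((nfft, nfft), dtype=complex)
    f = np.arange(nfft)
    # Index table for the conjugate term at frequency f1 + f2 (mod nfft).
    idx = (f[:, None] + f[None, :]) % nfft
    for s in range(segs):
        F = np.fft.fft(x[s * nfft:(s + 1) * nfft])
        # Accumulate F[f1] * F[f2] * conj(F[f1 + f2]) over the whole grid.
        B += F[:, None] * F[None, :] * np.conj(F[idx])
    return B / segs

# Quadratically phase-coupled test signal: components at bins 8, 12, and
# 8 + 12 = 20 with phases 0.5, 1.1, and 0.5 + 1.1 produce a bispectral
# peak at (f1, f2) = (8, 12).
t = np.arange(256)
x = (np.cos(2 * np.pi * 8 * t / 64 + 0.5)
     + np.cos(2 * np.pi * 12 * t / 64 + 1.1)
     + np.cos(2 * np.pi * 20 * t / 64 + 1.6))
B = bispectrum_direct(x, nfft=64)
print(np.abs(B[8, 12]))  # large peak; off-peak bins are near zero
```

The quadratic cost per segment, an nfft-by-nfft accumulator on top of the FFTs, is exactly the kind of computational and memory burden the abstract cites as the motivation for more efficient sequential and parallel estimators.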
Besides the theoretical research above, some collaborative work on machine learning applications will be briefly covered in the proposal presentation. These applications span material science, medical science, and metagenomics.
Future work will include efficient learning of neural network substructures. We propose to design a differentiable sub-network search algorithm with dynamic layer-wise pruning strength adjustment, leveraging the sensitivity of Lagrange multipliers.
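Since this is proposed future work, no algorithm exists to show; the following is only a hypothetical sketch of the dual-variable idea behind it. A single multiplier `lam` scales a per-layer magnitude threshold, and a dual-ascent-style update nudges `lam` until the global sparsity constraint is met. All names and the update rule are assumptions for illustration.

```python
import numpy as np

def layerwise_prune(weights, target_sparsity, lr=2.0, steps=200):
    """Multiplier-driven layer-wise pruning (hypothetical sketch).

    A shared multiplier lam scales each layer's threshold (proportional
    to that layer's weight scale); lam is adjusted by dual ascent on the
    sparsity constraint until the pruned fraction matches the target.
    """
    lam = 0.0
    scales = [np.std(w) for w in weights]      # per-layer weight scale
    total = sum(w.size for w in weights)
    for _ in range(steps):
        # Keep weights whose magnitude exceeds the layer's threshold.
        masks = [np.abs(w) > lam * s for w, s in zip(weights, scales)]
        sparsity = 1.0 - sum(m.sum() for m in masks) / total
        # Dual-ascent step: raise lam if we are under the target sparsity.
        lam = max(0.0, lam + lr * (target_sparsity - sparsity))
    return [w * m for w, m in zip(weights, masks)], sparsity

rng = np.random.default_rng(0)
weights = [rng.normal(size=(100, 100)), rng.normal(size=(50, 200))]
pruned, sparsity = layerwise_prune(weights, target_sparsity=0.5)
print(sparsity)  # converges close to the 0.5 target
```

The sensitivity mentioned in the abstract would correspond to how strongly the constraint violation moves the multiplier; here that role is played crudely by the fixed step size `lr`.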