February 4, 2020 –
Title: Effective Algorithms for the Closest Pair and Related Problems
PhD Candidate: Xingyu Cai
Major Advisor: Sanguthevar Rajasekaran
Associate Advisors: Yufeng Wu and Song Han
Date/Time: Tuesday, February 4, 2019 3:00PM
Location: HBL 1947 Conference Room
Abstract: The Closest Pair problem aims to identify the closest pair (using some similarity measure, e.g., Euclidean distance, Dynamic Time Warping distance, etc.) of points in a metric space. This is one of the fundamental problems that has a wide range of applications in data mining area, since most of the data can be represented in vector format residing in a high dimensional space, and we would like to identify the relationship among those data points.
Typical applications include but not limited to, social data analysis, user pattern identification, motif mining in biological data, data clustering, etc. This is a very classical problem and has been studied very well in the past decades.
In this thesis, we study the Closest Pair problem and its variants, and also bring the machine learning perspective to solve some closely related problems. In particular, we have proposed two approximate algorithms to efficiently address the Closest Pair of Points (CPP) problem, and one deterministic approach to solve the Closest Pair of Subsequences (CPS) problem, in the scope of Euclidean distance metric. In addition, to identify the closest subsequences in the time series data, we have proposed a learnable feature extractor embedded in an artificial neural network, to learn patterns under the Dynamic Time Warping measure. In the end, to speed up the inference speed of the proposed algorithm, we have also proposed a neural network pruning technique to obtain a smaller network with similar capacity. All the proposed methods are shown to have achieved the state-of-the-art performance in various standard benchmark datasets.