September 25, 2017
Title: Applied Algorithms in Bioinformatics
Student: Sudipta Pathak
Major Advisor: Dr. Sanguthevar Rajasekaran
Associate Advisors: Dr. Reda Ammar, Dr. Swapna Gokhale
Date/Time: Monday, September 25th, 2017 at 9 am
Location: ITE 125
Drug-Target Interaction Prediction
Drug-Target Interaction (DTI) prediction is an active area of research with a large impact on pharmaceutical research and drug repositioning. Improving DTI prediction accuracy could save much of the time, effort, and money invested in wet-lab experiments for drug discovery and drug repositioning. However, most algorithms proposed for DTI prediction suffer from poor accuracy, long runtimes, or both. Moreover, most of the best-performing algorithms are not easily extended to distributed or parallel settings. In this paper, we propose shallow and deep learning based algorithms to address the DTI problem. The shallow learning based algorithm uses an ensemble learning approach to combine multiple heterogeneous sources of information to identify new interactions between drugs and targets. Empirical results on four data sets show that the proposed strategy has comparable or better prediction accuracy than contemporary methods. The deep neural network based approach uses an autoencoder (AE) to improve prediction accuracy. The AE based algorithm is developed using the well-known deep learning libraries Keras and TensorFlow. Through extensive experiments we show that our results outperform all competing algorithms in both prediction accuracy and runtime.
Time series classification (TSC) is the problem of predicting the class labels of time series generated by different signal sources. TSC has been a challenging problem in machine learning and statistics for many decades, and it has many important applications in bioinformatics, biomedical engineering, and clinical prediction. A large number of classification algorithms have been developed to address the TSC problem. However, there is still considerable room for improving classification accuracy. Traditional approaches extract discriminative features from the time series data by applying different types of transformations. These features are then fed into standard classifiers. Following the tremendous success of deep neural networks in certain areas, some researchers applied approaches based on deep convolutional neural networks and recurrent neural networks to TSC. These deep neural network based algorithms established a new baseline for TSC. In this paper, we propose Ensemble Deep TimeNet (EDTNet), an ensemble of multiple deep neural networks for TSC. We compared the accuracy of EDTNet with state-of-the-art algorithms on 44 different datasets from the UCR time series database. Through extensive experiments we show that EDTNet outperforms all the competing algorithms on most of the UCR datasets in terms of classification accuracy.
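The abstract does not state how the member networks of EDTNet are combined. As a minimal sketch of one common way to build such an ensemble, the snippet below averages the class probabilities produced by several already-trained classifiers and picks the most probable class. The function name `ensemble_predict` and the probability values are hypothetical and for illustration only; they are not taken from EDTNet.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-model class probabilities (soft voting).

    prob_list: list of arrays, each of shape (n_series, n_classes),
    one array per trained member model (assumed to exist already).
    Returns the predicted labels and the averaged probabilities.
    """
    stacked = np.stack(prob_list)        # (n_models, n_series, n_classes)
    mean_probs = stacked.mean(axis=0)    # average over the member models
    return mean_probs.argmax(axis=1), mean_probs

# Illustrative probabilities from three hypothetical member networks
# for two time series and two classes.
p1 = np.array([[0.6, 0.4], [0.2, 0.8]])
p2 = np.array([[0.4, 0.6], [0.3, 0.7]])
p3 = np.array([[0.7, 0.3], [0.6, 0.4]])

labels, probs = ensemble_predict([p1, p2, p3])
print(labels)
```

Averaging probabilities rather than counting hard votes lets a confident member outweigh uncertain ones, which is one reason this form of combination is often tried first.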
On Speeding-up Parallel Jacobi Iterations for SVDs
We live in an era of big data, and the analysis of these data is becoming a bottleneck in many domains, including biology and the internet. To make these analyses feasible in practice, we need efficient data reduction algorithms. The Singular Value Decomposition (SVD) is a data reduction technique that has been used in many different applications. For example, SVDs have been used extensively in text analysis. The best known sequential algorithms for computing SVDs take cubic time, which may not be acceptable in practice. As a result, many parallel algorithms have been proposed in the literature. There are two main kinds of algorithms for SVD: those based on QR decomposition and those based on Jacobi iterations. Researchers have found that even though QR is sequentially faster than Jacobi iterations, QR is difficult to parallelize. As a result, most of the parallel algorithms in the literature are based on Jacobi iterations. For example, the Jacobi Relaxation Scheme (JRS) of the classical Jacobi algorithm has been shown to be very effective in parallel. In this paper, we propose a novel variant of the classical Jacobi algorithm that is more efficient than the JRS algorithm. Our experimental results confirm this assertion. The key idea behind our algorithm is to select the pivot elements for each sweep appropriately. We also show how to efficiently implement our algorithm on parallel models such as the PRAM and the mesh.
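For readers unfamiliar with Jacobi iterations, the following is a minimal sequential sketch of the classical one-sided (Hestenes) Jacobi method with a fixed cyclic pivot order. It is not the JRS algorithm and not the pivot-selection variant proposed in this work; the function name and the tolerance parameters are assumptions chosen for illustration. Each sweep visits every pair of columns and applies a plane rotation that makes the pair orthogonal; once all pairs are orthogonal, the column norms are the singular values.

```python
import numpy as np

def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=60):
    """Return the singular values of A in descending order."""
    U = np.array(A, dtype=float)   # working copy; columns get rotated
    n = U.shape[1]
    for _ in range(max_sweeps):
        rotated = False
        # One sweep: visit every column pair (p, q) in cyclic order.
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = U[:, p] @ U[:, p]
                beta = U[:, q] @ U[:, q]
                gamma = U[:, p] @ U[:, q]
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                sign = 1.0 if zeta >= 0 else -1.0
                t = sign / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                up = U[:, p].copy()
                U[:, p] = c * up - s * U[:, q]
                U[:, q] = s * up + c * U[:, q]
        if not rotated:            # converged: no pair needed a rotation
            break
    return np.sort(np.linalg.norm(U, axis=0))[::-1]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
print(one_sided_jacobi_svd(A))
print(np.linalg.svd(A, compute_uv=False))   # reference values from LAPACK
```

Rotations on disjoint column pairs touch different columns and can therefore be applied independently, which is why the order in which pairs are chosen within a sweep matters for parallel implementations.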