April 1, 2019
Title: Computational Methods for the Analysis of Single Cell RNA-Seq Data.
PhD Candidate: Marmar Moussa
Major Advisor: Dr. Ion Mandoiu
Associate Advisors: Dr. Mukul Bansal, Dr. Sheida Nabavi
Day/Time: Monday, April 1st, 2019 10:00 AM
Location: Homer Babbidge Library Class of 1947 Conference Room
Single-cell transcriptional profiling is critical for understanding cellular heterogeneity, identifying novel cell types, and studying the growth and development of tissues and tumors. Leveraging recent advances in single-cell RNA sequencing (scRNA-Seq) technology requires novel computational methods that are robust to high levels of technical and biological noise and that scale to datasets of millions of cells.
In this work, we address several challenges in the scRNA-Seq data analysis workflow. First, we propose novel computational approaches for unsupervised clustering of scRNA-Seq data based on the Term Frequency-Inverse Document Frequency (TF-IDF) transformation, which has been used successfully in the field of text analysis. For this part, we present empirical results showing that TF-IDF methods consistently outperform commonly used scRNA-Seq clustering approaches. Second, we study the so-called "drop-out" effect, in which only a fraction of the transcriptome of each cell is captured and which is considered one of the most notable challenges in scRNA-Seq data analysis. The random nature of drop-outs, however, makes it possible to consider imputation methods as a means of correcting for them. In this part, we study several existing scRNA-Seq imputation methods and propose a novel iterative imputation approach (LSImpute) based on efficiently finding highly similar cells using Locality Sensitive Hashing (LSH). We then present the results of a comprehensive assessment of existing and proposed methods on real scRNA-Seq datasets with varying per-cell sequencing depths. Third, we present a computational method for assigning cells to cell-cycle stages and/or ordering them by stage from single-cell transcriptome data. Finally, we present a web-based interactive computational workflow for the analysis and visualization of scRNA-Seq data.
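As a rough illustration of the TF-IDF idea mentioned in the abstract, the sketch below treats each cell as a "document" and each gene as a "term". It assumes one common textbook variant (term frequency as a gene's share of the cell's total counts, inverse document frequency as the log of the number of cells divided by the number of cells expressing the gene). The abstract does not state which variant the thesis uses, so the function name `tfidf_transform` and the exact weighting here are illustrative assumptions, not the author's method.

```python
import math

def tfidf_transform(counts):
    """Hypothetical helper: TF-IDF scores for a genes x cells count matrix.

    counts is a list of rows (one per gene), each a list of non-negative
    counts (one per cell). The weighting is a textbook variant chosen for
    illustration; it is an assumption, not taken from the thesis.
    """
    n_genes = len(counts)
    n_cells = len(counts[0]) if n_genes else 0
    # Total counts per cell, used to normalize the term frequency.
    cell_totals = [sum(counts[g][c] for g in range(n_genes))
                   for c in range(n_cells)]
    scores = []
    for row in counts:
        # "Document frequency": number of cells in which the gene is detected.
        cells_expressing = sum(1 for x in row if x > 0)
        # Genes detected in no cell get weight 0; genes detected in every
        # cell also get weight 0 because log(1) == 0.
        idf = math.log(n_cells / cells_expressing) if cells_expressing else 0.0
        scores.append([
            (x / cell_totals[c]) * idf if cell_totals[c] else 0.0
            for c, x in enumerate(row)
        ])
    return scores

# Toy example: 3 genes x 3 cells.
toy_counts = [
    [1, 1, 1],  # detected in every cell, so it carries no weight
    [2, 0, 0],  # detected in a single cell, so it is up-weighted there
    [0, 0, 0],  # never detected
]
toy_scores = tfidf_transform(toy_counts)
```

Under this weighting, genes detected in every cell contribute nothing, while genes detected in only a few cells are emphasized, which is the intuition behind borrowing the transformation from text analysis for clustering.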