BEGIN:VCALENDAR
VERSION:2.0
METHOD:PUBLISH
PRODID:-//Accessible Web Design//My Calendar//http://www.joedolson.com//v2.5.16//EN
BEGIN:VEVENT
UID:90-79
LOCATION:Babbidge Library Class of 1947 Conference room
SUMMARY:PhD Proposal - Timothy Becker
DTSTAMP:20171101T113000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171101T113000
DTEND:20171101T123000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Title: Machine Learning methods for Complex Structural Variation analysis\nStudent: Timothy Becker\nMajor Advisor: Dr. Dong-Guk Shin\nAssociate Advisors: Dr. Yufeng Wu\, Dr. Ion Mandoiu\nDate/Time: Wednesday\, November 1st\, 2017 at 11:30am in Babbidge 1947 meeting room\nAbstract:\nDetecting variations larger than 50 nucleotide bases\, called Structural Variants (SV)\, with current DNA sequencing technology remains challenging in normal tissues\, but becomes more problematic with the increased heterogeneity and allele complexity found in tumor tissues. We focus on three areas:\n(1) Multi-input SV arbitration method\nThe first part of the dissertation describes a multi-input SV fusion method (FusorSV) that uses features and prior knowledge to produce a comprehensive and arbitrated call set. We show that this approach works well on deletion\, duplication and inversion call types in germline data by constructing a fully automated SV calling engine (SVE) that runs eight popular calling algorithms and utilizes the freely available 1000 Genomes Phase 3 high coverage data set. By focusing on the SV type and length as features\, FusorSV outperformed existing algorithms based on 1000 rounds of permutation testing and had a concordantly high in vitro validation rate in excess of 85% for novel SV events.\n(2) Somatic genome generation method\nThe second area details a genome generator (soMaCX) that models somatic evolution from subclonal to cancer stem cell instances under continuous control. Joint SV distributions are constructed from SV type\, size\, complexity and region controls. Gain and loss of function are modeled by considering the SV type and its positive or negative effect after transcription\, thereby providing the needed mechanism to simulate selective pressure on ONCO genes and replication regions like the NHEJ pathway. To provide user control of sample purity\, reads are simulated for both normal and somatic tissues and the resulting data is randomly sampled.\n(3) Sequence feature extraction method and application\nThe final part of this dissertation comprises a sequence feature extraction framework (SAFE) and its application to somatic SV analysis. We propose a genomic signal processor framework that abstracts and transforms sequences and alignment entries into feature vectors such as read depth\, split read depth\, clipped read depth\, supplemental read depth\, strand bias\, k-mer frequency and nucleic acid proportion. Integral to this framework will be an out-of-core data structure that will offer efficient random access\, normalization and indexing on large data sets. We will then prove effectiveness by application to SV allele complexity and heterogeneity using machine learning methods in conjunction with SAFE.
CATEGORIES:Colloquia
END:VEVENT
BEGIN:VEVENT
UID:91-80
LOCATION:ITE 125
SUMMARY:PhD Proposal - Abdullah Al-Mamun
DTSTAMP:20171127T093000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171127T093000
DTEND:20171127T103000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Title: Novel Algorithms for Some Fundamental Big Data Problems\nStudent: Abdullah-Al Mamun\nMajor Advisor: Dr. Sanguthevar Rajasekaran\nAssociate Advisors: Dr. Reda Ammar\, Dr. Ion Mandoiu\nDate/Time: Monday\, Nov 27th\, 2017 at 9:30 am\nLocation: ITE 125\nAbstract:\nIn this digital era\, data sets are growing rapidly. Storing\, processing\, and analyzing large volumes of data require efficient techniques. My proposal will focus on the following three areas:\nK-mer counting problem:\nA massive number of bioinformatics applications require counting of k-length substrings in genetically important long strings. Genome assembly\, repeat detection\, multiple sequence alignment\, error detection\, and many other related applications use a k-mer counter as a building block. Very fast and efficient algorithms are necessary to count k-mers in large data sets to be useful in such applications. We propose a novel trie-based algorithm for this k-mer counting problem.\nRecord linkage problem:\nIntegrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication\, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. Here we have come up with efficient sequential and parallel algorithms for record linkage which can handle any number of datasets. Our methods employ single linkage as well as complete linkage hierarchical clustering to address this problem.\nProblems with algorithmic challenges:\nFinding minimum spanning trees (MST) in various types of networks is a well-studied problem in theory and practical applications. We have devised a very efficient algorithm which combines ideas from randomized selection\, Kruskal’s algorithm and Prim’s algorithm.\nAlgorithms for finding the closest l-mers have been used in solving the (l\, d)-motif search problem. We propose novel exact and approximate algorithms for this problem for the special case of m = 3.
CATEGORIES:Colloquia
END:VEVENT
BEGIN:VEVENT
UID:92-81
LOCATION:Babbidge Library Class of 1947 Conference room
SUMMARY:PhD Proposal - Timothy Becker
DTSTAMP:20171127T120000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171127T120000
DTEND:20171127T130000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Title: Machine Learning methods for Complex Structural Variation analysis\nMajor Advisor: Dr. Dong-Guk Shin\nAssociate Advisors: Dr. Yufeng Wu\, Dr. Ion Mandoiu\nDate/Time: Monday\, November 27\, 2017 at 12:00pm in Babbidge 1947 meeting room\nAbstract:\nDetecting variations larger than 50 nucleotide bases\, called Structural Variants (SV)\, with current DNA sequencing technology remains challenging in normal tissues\, but becomes more problematic with the increased heterogeneity and allele complexity found in tumor tissues. We focus on three areas:\n(1) Multi-input SV arbitration method\nThe first part of the dissertation describes a multi-input SV fusion method (FusorSV) that uses features and prior knowledge to produce a comprehensive and arbitrated call set. We show that this approach works well on deletion\, duplication and inversion call types in germline data by constructing a fully automated SV calling engine (SVE) that runs eight popular calling algorithms and utilizes the freely available 1000 Genomes Phase 3 high coverage data set. By focusing on the SV type and length as features\, FusorSV outperformed existing algorithms based on 1000 rounds of permutation testing and had a concordantly high in vitro validation rate in excess of 85% for novel SV events.\n(2) Somatic genome generation method\nThe second area details a genome generator (soMaCX) that models somatic evolution from subclonal to cancer stem cell instances under continuous control. Joint SV distributions are constructed from SV type\, size\, complexity and region controls. Gain and loss of function are modeled by considering the SV type and its positive or negative effect after transcription\, thereby providing the needed mechanism to simulate selective pressure on ONCO genes and replication regions like the NHEJ pathway. To provide user control of sample purity\, reads are simulated for both normal and somatic tissues and the resulting data is randomly sampled.\n(3) Sequence feature extraction method and application\nThe final part of this dissertation comprises a sequence feature extraction framework (SAFE) and its application to somatic SV analysis. We propose a genomic signal processor framework that abstracts and transforms sequences and alignment entries into feature vectors such as read depth\, split read depth\, clipped read depth\, supplemental read depth\, strand bias\, k-mer frequency and nucleic acid proportion. Integral to this framework will be an out-of-core data structure that will offer efficient random access\, normalization and indexing on large data sets. We will then prove effectiveness by application to SV allele complexity and heterogeneity using machine learning methods in conjunction with SAFE.
CATEGORIES:Colloquia
END:VEVENT
BEGIN:VEVENT
UID:93-82
LOCATION:Babbidge Library Class of 1947 Conference room
SUMMARY:M.S. Defense - Fanghui Liu
DTSTAMP:20171129T120000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171129T120000
DTEND:20171129T130000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Major Advisor: Laurent Michel\nAssociate Advisors: Alexander Russell\, Benjamin Fuller\nTitle: A Tolerant Algebraic Side-Channel Attack on AES Using CP\nDate: Wednesday\, November 29\, 2017 at 12:00 PM\nLocation: HBL 1947 Conference Room\nAbstract:\nAES is a mainstream block cipher used in many protocols\, and its resilience against attack is essential for cybersecurity. In [1]\, Oren et al. discuss a Tolerant Algebraic Side-Channel Analysis (TASCA) and show how to use optimization technology to exploit side-channel information and mount a computational attack against AES. This thesis revisits the TASCA attack and the results published earlier in the conference paper [2]\, and shows that Constraint Programming is a strong contender and a potent optimization solution. It extends bit-vector solving as introduced in [3]\, develops a CP and an IP model\, and compares them with the original Pseudo-Boolean formulation. The empirical results establish that CP can deliver solutions with orders of magnitude improvement in both run time and memory usage\, traits that are essential to potential adoption by cryptographers.\n[1] Oren\, Y.\, Wool\, A.: Side-channel cryptographic attacks using pseudo-boolean optimization. Constraints 21(4)\, 616-645 (2016)\, http://dx.doi.org/10.1007/s10601-015-9237-3\n[2] Liu\, F.\, Cruz\, W.\, Ma\, C.\, Johnson\, G.\, Michel\, L.: A Tolerant Algebraic Side-Channel Attack on AES Using CP\, pp. 189-205. Springer International Publishing\, Cham (2017)\, https://doi.org/10.1007/978-3-319-66158-2_13\n[3] Michel\, L.\, Van Hentenryck\, P.: Constraint satisfaction over bit-vectors. In: International Conference on Principles and Practice of Constraint Programming-CP 2012. pp. 527-543. Springer (2012)
CATEGORIES:Colloquia
END:VEVENT
BEGIN:VEVENT
UID:97-85
LOCATION:Babbidge Library Class of 1947 Conference room
SUMMARY:PhD Proposal - Misagh Kordi
DTSTAMP:20171201T153000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171201T153000
DTEND:20171201T163000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Title: Inferring Microbial Gene Family Evolution using Duplication-Transfer-Loss Reconciliation: Algorithms and Complexity\nMajor Advisor: Dr. Mukul Bansal\nAssociate Advisors: Dr. Ion Mandoiu\, Dr. Yufeng Wu\, Dr. Sheida Nabavi\nDate/Time: Friday\, December 1\, 2017 at 3:30pm\nLocation: Homer Babbidge Library Class of 1947 Conference Room\nReconstructing the evolutionary histories of genes and genomes is an important problem in evolutionary biology and fundamental to our modern understanding of biology. In microbes\, gene families evolve through complex evolutionary processes such as speciation\, gene duplication\, horizontal gene transfer\, and gene loss. In the typical formulation of this problem\, called the DTL reconciliation problem\, the goal is to reconcile an input gene tree (gene family phylogeny) to the corresponding rooted species tree by postulating speciation\, duplication\, transfer\, and loss events. In this dissertation proposal\, we focus on one of the most important limitations of the DTL reconciliation framework\, gene uncertainty\, and provide new problem formulations\, analyze the computational complexities of the new formulations\, and provide FPT algorithms for them.
CATEGORIES:Colloquia
END:VEVENT
BEGIN:VEVENT
UID:96-84
LOCATION:ITEB 401
SUMMARY:PhD Proposal - Chujiao Ma
DTSTAMP:20171205T140000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171205T140000
DTEND:20171205T150000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Title: Practicality and Application of the Algebraic Side-Channel Attack\nStudent: Chujiao Ma\nMajor Advisor: Dr. John Chandy\nAssociate Advisors: Dr. Laurent Michel\, Dr. Bing Wang\nLocation: ITEB 401\nAbstract:\nSide-channel attacks break cryptographic algorithms by performing statistical analysis on information correlated with the secret key during encryption\, such as timing\, power consumption or electromagnetic waves. While this type of attack is popular due to its passive and non-invasive nature\, it is vulnerable to noise in the system and measuring equipment. To reduce the limitations of the attack\, it is often combined with other methods such as algebraic analysis. This proposal focuses on the practicality and application of the algebraic side-channel attack (ASCA) to retrieve cryptographic keys.\nASCA models the cryptographic algorithm as well as the side-channel information as a set of equations that are put through a solver to solve for the secret key. The proposal first considers the application of ASCA to different algorithms. ASCA was performed with side-channel information from faults injected in the system and proven to be successful on simple algorithms such as LED as well as GOST. ASCA was also performed on TwoFish and AES with side-channel information collected from power consumption\, which is more difficult to detect.\nASCA allows the attack to succeed in unknown plaintext/ciphertext scenarios and has low data complexity. While the attack can succeed for algorithms of varying complexity\, it is susceptible to error from the side-channel information. We attempt to mitigate the effect of error by exploiting the incomplete diffusion feature in one AES round using incomplete diffusion analytical side-channel analysis (IDASCA). In addition to different ways to exploit the data\, we also explored using different solvers. While ASCA has traditionally been solved with SAT solvers\, we introduce the use of a Constraint-Programming (CP) solver\, which allows us to have a simpler model with better error tolerance.\nSince ASCA is feasible against a variety of algorithms and is error-tolerant\, we then examined how the structure of the algorithm (confusion\, diffusion\, non-linearity\, complexity of operations) affects the attack\, which will give us ideas not only on how to improve the attack but also on different ways countermeasures can be designed.
CATEGORIES:Colloquia
END:VEVENT
BEGIN:VEVENT
UID:94-83
LOCATION:Babbidge Library Class of 1947 Conference room
SUMMARY:PhD Proposal - Mahmoodreza Jahanseir
DTSTAMP:20171207T100000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171207T100000
DTEND:20171207T110000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Title: Hierarchical Structures for High Dimensional Data Analysis\nMajor Advisor: Dr. Donald Sheehy\nAssociate Advisors: Dr. Thomas Peters\, Dr. Sanguthevar Rajasekaran\nDate/Time: Thursday\, December 7\, 2017 at 10:00am\nLocation: Homer Babbidge Library Class of 1947 Conference Room\nAbstract:\nThe volume of data is not the only problem in modern data analysis. Data complexity is often more challenging. In many areas such as computational biology\, topological data analysis\, and machine learning\, the data resides in high dimensional spaces which may not even be Euclidean. Therefore\, processing such massive and complex data and extracting useful information is a big challenge. Here\, the input is a set of objects and a metric that measures the distance between pairs of objects.\nIn this proposal\, we first consider the problem of preprocessing and organizing such complex data into a hierarchical data structure that allows efficient nearest neighbor and range queries. There have been many data structures for general metric spaces\, but almost all of them have construction time that can be quadratic in terms of the number of points. There are only two data structures with O(n log n) construction time\, but both have very complex algorithms and analyses\, and they cannot be implemented efficiently. Here\, we present a simple randomized incremental algorithm that builds a metric data structure in O(n log n) time in expectation. Thus\, we achieve the best of both worlds: simple implementation with theoretically optimal performance.\nFurthermore\, we consider the close relationship between our metric data structure and orderings on the points used in applications such as k-center clustering. We give linear time algorithms to go back and forth between these orderings and our metric data structure.\nIn the last part of this proposal\, we use metric data structures to extract topological features of a data set\, such as the number of connected components\, holes\, and voids. We give a linear time algorithm for constructing a (1 + ε)-approximation to the so-called Vietoris-Rips filtration of a metric space\, a fundamental tool in topological data analysis.
CATEGORIES:Colloquia
END:VEVENT
BEGIN:VEVENT
UID:98-86
LOCATION:Babbidge Library Class of 1947 Conference room
SUMMARY:PhD Proposal - Jingwen Pei
DTSTAMP:20171214T100000
ORGANIZER;CN=Howard E:MAILTO:howard.ellis@uconn.edu
DTSTART:20171214T100000
DTEND:20171214T110000
URL;VALUE=URI:http://www.cse.uconn.edu/events
DESCRIPTION:Title: Inferring the ancestry of parents and grandparents from genetic data\nStudent: Jingwen Pei\nMajor Advisor: Dr. Yufeng Wu\nAssociate Advisors: Dr. Ion Mandoiu\, Dr. Mukul Bansal\nDay/Time: Thursday\, December 14\, 2017 10:00am\nLocation: Babbidge 1947 Conference Room\nAbstract:\nInference of admixture proportions is a classical statistical problem in population genetics. Standard methods implicitly assume that both parents of an individual have the same admixture fraction. However\, this is rarely the case in real data. In this project we show that the distribution and lengths of admixture tracts in a genome contain information about the admixture proportions of the ancestors of an individual. We develop a Hidden Markov Model (HMM) framework for estimating the admixture proportions of the immediate ancestors of an individual\, i.e. a type of appropriation of an individual's admixture proportions into further subsets of ancestral proportions in the ancestors. Based on a genealogical model for admixture tracts\, we develop an efficient algorithm for computing the sampling probability of the genome from a single individual\, as a function of the admixture proportions of the ancestors of this individual. This allows us to perform probabilistic inference of admixture proportions of ancestors using only the genome of an extant individual. Extensive simulations are conducted to quantify the error in the estimation of ancestral admixture proportions under various conditions. As an illustration\, we also apply the method to real data from the 1000 Genomes Project.
CATEGORIES:Colloquia
END:VEVENT
END:VCALENDAR