PhD Proposal – Lei Li

September 17, 2018 @ 1:30 pm - 2:30 pm UTC-5

Title: An Integrated Framework for Domain, Gene and Species Reconciliation
Ph.D. Candidate: Lei Li
Major Advisor: Dr. Mukul Bansal
Associate Advisors: Dr. Ion Mandoiu, Dr. Yufeng Wu
Day/Time: Monday, September 17th, 2018 1:30 PM
Location: HBL Class of 1947 Conference Room

Abstract: Inferring the evolutionary history of genomes plays an important role in modern biology. Genes, as functional fragments of DNA sequence, evolve inside genomes through complex mechanisms and are often treated as the minimal evolutionary unit. However, biological studies in recent decades have shown that the majority of genes in eukaryotes consist of multiple protein domains that can be independently lost or gained during evolution. Although a large body of research exists on protein domains, this work usually focuses on domain architectures or uses domain content as auxiliary information for analyzing gene evolutionary history. The study of the evolutionary history of domains themselves is therefore still in its infancy, and the combination of domain-level and gene-level evolution has never been addressed. In this proposal we develop an integrated model of domain evolution that explicitly captures the interdependence of domain-, gene-, and species-level evolution (DGS). In our DGS model, domains evolve inside genes through evolutionary events such as domain co-divergence, domain duplication and domain loss, and genes evolve inside species through gene evolutionary events, typically gene duplication and gene loss for eukaryotes.

The proposed model extends the classical phylogenetic reconciliation framework, which infers gene family evolution by topologically comparing a gene tree with its corresponding species tree. Our model explicitly considers domain-level evolution and decouples domain-level events from gene-level events.
Under our DGS model, we define the optimal DGS reconciliation problem as finding a joint reconciliation scenario that minimizes the total reconciliation cost, summed over domain-level and gene-level evolutionary events. We proved that the corresponding computational problem is NP-hard. Because the true history of domain evolution is unknown, the difficulty lies not only in demonstrating the impact of the DGS model with a solution, but also in evaluating the performance of that solution. To cope with this difficulty, we first proposed a polynomial-time heuristic based on dynamic programming, which is efficient enough to show the impact of the DGS model on real biological data. We then gave an algorithm based on integer linear programming to reveal the impact of the DGS model more fully and to justify the dynamic programming heuristic. We are now working on a version of the DGS model that handles multiple domain trees.
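
For readers unfamiliar with the classical reconciliation framework that the abstract refers to, the following is a minimal sketch of how a gene tree can be compared with a species tree by mapping each gene-tree node to the lowest common ancestor (LCA) of its children's mappings. It covers only the gene and species layers and counts duplications only; it is not the DGS model, the dynamic programming heuristic, or the integer linear programming algorithm from the proposal. The tree encoding (nested tuples), the function names, and the example data are all assumptions made for illustration.

```python
# Illustrative LCA-based gene tree / species tree reconciliation.
# Trees are nested 2-tuples; leaves are strings. All names are hypothetical.

def build_parents(tree, parent=None, parents=None):
    """Map every species-tree node to its parent (the root maps to None)."""
    if parents is None:
        parents = {}
    parents[tree] = parent
    if isinstance(tree, tuple):
        for child in tree:
            build_parents(child, tree, parents)
    return parents

def ancestors(node, parents):
    """Return the path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path

def lca(a, b, parents):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(ancestors(a, parents))
    for node in ancestors(b, parents):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def count_duplications(gene_tree, leaf_species, parents):
    """Return (species node the gene node maps to, duplications below it)."""
    if not isinstance(gene_tree, tuple):
        return leaf_species[gene_tree], 0
    left, right = gene_tree
    m_left, d_left = count_duplications(left, leaf_species, parents)
    m_right, d_right = count_duplications(right, leaf_species, parents)
    mapped = lca(m_left, m_right, parents)
    # A gene node is a duplication if it maps to the same species node
    # as at least one of its children; otherwise it is a speciation.
    is_duplication = mapped == m_left or mapped == m_right
    return mapped, d_left + d_right + int(is_duplication)

# Made-up example: species A and B are sisters, C is the outgroup.
species_tree = (("A", "B"), "C")
gene_tree = ((("a1", "b1"), ("a2", "b2")), "c1")
leaf_species = {"a1": "A", "b1": "B", "a2": "A", "b2": "B", "c1": "C"}

parents = build_parents(species_tree)
root_mapping, duplications = count_duplications(gene_tree, leaf_species, parents)
print(duplications)  # prints 1
```

In this example the node joining the two (A, B) gene copies maps to the same species node as both of its children, so it is counted as one duplication, while the remaining internal nodes are speciations. A full reconciliation cost would also account for losses, and the DGS model described above adds a domain layer on top of this comparison.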


HBL Class of 1947 Conference Room
UConn Library, 369 Fairfield Way, Unit 1005
Storrs, CT 06269 United States
