March 30, 2020
Title: Computational Approaches for Horizontal Gene Transfer Classification by Scale and Type
PhD Candidate: Lina Kloub
Major Advisor: Dr. Mukul Bansal
Associate Advisors: Dr. Ion Mandoiu, Dr. Derek Aguiar, Dr. J. Peter Gogarten
Date/Time: Monday, March 30, 2020 10:30 AM
Location: HBL Class of 1947 Conference Room
The transfer of genetic information between organisms that are not in a direct ancestor-descendant relationship, called Horizontal Gene Transfer (HGT), is a crucial process in microbial evolution. However, little is known about how horizontally transferred genes are integrated into recipient genomes or about the "scale" of individual HGT events. In this work, we develop new computational frameworks to characterize two fundamental properties of HGT events. The first property concerns the units of HGT events. An HGT event may involve the transfer of a gene fragment, an entire gene, or multiple genes, and very little is currently known about the units of HGT events. The second property concerns the mode of an HGT event. In particular, when a gene is horizontally transferred, it may either add itself as a new gene to the recipient genome or replace an existing homologous gene. Currently, studies do not usually distinguish between "additive" and "replacing" HGTs, and their specific roles and impacts on microbial evolution are poorly understood.
Here, we build upon recent computational advances in the detection of HGTs and leverage the recent large-scale availability of microbial genomic datasets to develop new computational frameworks to study these two fundamental properties of HGT events. Our first method infers single-gene HGT events across the given set of species, uses several techniques to account for inference uncertainty, combines that information with gene order information, and uses statistical analysis to identify candidate horizontal multi-gene transfers (HMGTs). Our second method uses the discovered HGTs and HMGTs, considers the gene neighborhoods around these genes, and uses statistical analysis to classify the HGTs and HMGTs as additive or replacing. We apply both methods to a genome-scale dataset of over 22,000 gene families from 103 Aeromonas genomes. Our first method identifies a large number of plausible HMGTs of various scales at both small and large phylogenetic distances, and reveals interesting relationships between gene function, phylogenetic distance, and frequency of multi-gene transfer. The second method classifies a subset of the inferred HGTs and HMGTs as either additive or replacing with high confidence, and, again, identifies relationships between gene function, phylogenetic distance, and HGT/HMGT type.
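As a rough illustration of how single-gene HGT inferences might be combined with gene order information, the following Python sketch groups transfers that share a donor and recipient and whose genes lie close together in the recipient genome. It is a simplified assumption for illustration only, not the method presented in the talk: the names Transfer, candidate_hmgts, max_gap, and min_genes are hypothetical, and the sketch omits both the handling of inference uncertainty and the statistical analysis described above.

```python
from collections import defaultdict
from typing import NamedTuple


class Transfer(NamedTuple):
    """One inferred single-gene HGT (hypothetical record layout)."""
    donor: str      # label of the donor species or branch
    recipient: str  # label of the recipient species or branch
    position: int   # index of the gene in the recipient's gene order


def candidate_hmgts(transfers, max_gap=1, min_genes=2):
    """Group transfers with the same donor and recipient into runs of
    neighboring genes, allowing up to max_gap intervening genes that
    were not inferred as transferred. Runs shorter than min_genes are
    discarded. Returns a list of (donor, recipient, positions)."""
    by_pair = defaultdict(set)
    for t in transfers:
        by_pair[(t.donor, t.recipient)].add(t.position)

    clusters = []
    for (donor, recipient), positions in by_pair.items():
        ordered = sorted(positions)
        run = [ordered[0]]
        for pos in ordered[1:]:
            if pos - run[-1] <= max_gap + 1:
                run.append(pos)  # close enough: extend the current run
            else:
                if len(run) >= min_genes:
                    clusters.append((donor, recipient, run))
                run = [pos]      # too far: start a new run
        if len(run) >= min_genes:
            clusters.append((donor, recipient, run))
    return clusters
```

In this sketch, a larger max_gap tolerates more missed single-gene inferences at the cost of more spurious groupings; a real analysis would still need a statistical test to judge whether a grouping is more contiguous than expected by chance.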
In future work, we seek to leverage recently developed, but error-prone, reconciliation-based methods for classifying HGTs by type (additive or replacing) to improve the classification performance of our second method, and to test two specific biological hypotheses relating phylogenetic distances and HGT types.