Abstract:
Bayesian nonparametric graphical models provide a formal mechanism for encoding probabilistic assumptions about the data-generating process when the dimension of the latent space is unknown a priori or may grow with additional samples. A common limitation of these models is that posterior inference is computationally intensive, particularly for nonconjugate models or when integrating over combinatorial structures. In this talk, I will introduce two hierarchical Bayesian nonparametric models: one for clustering sequential data using fragmentation-coagulation processes, and one for admixture modelling of expressed gene transcripts. Both models are highly interpretable and flexibly adapt the number of clusters or mixture components to the data. I will demonstrate practical algorithms that cast statistical inference as optimization, highlighting the difficulties associated with nonconjugacy and with inference over the space of graphs. To conclude, I will present future directions in probabilistic machine learning, focusing on practicality and uncertainty in Bayesian nonparametric models.
Bio:
Derek Aguiar is a postdoctoral research associate in the Computer Science Department at Princeton University. He earned a B.S. in Computer Science and Computer Engineering from the University of Rhode Island and a Ph.D. in Computer Science from Brown University. His current research explores probabilistic machine learning and scalable inference methods with applications in genetics and genomics.