
Computer Science & 
Engineering Department 
371 Fairfield Road 
Unit 2155 
Storrs, CT 06269-2155 
Phone: (860) 486-3719 
Fax: (860) 486-4817 



Colloquia, Seminars and Conference News

Title : Active Learning with Hidden Factor Models in Collaborative Prediction and Systems Management

Date : March 15, 2007 (2:00 pm). Tea starts half an hour before each seminar.

Location: ITEB 336

Speaker : Dr. Irina Rish

Abstract:

Various tasks arising in the management of complex distributed computer systems and networks, such as problem diagnosis and resource allocation, require fast real-time inferences based on available system measurements, and a smart choice of such measurements can greatly improve both the quality and speed of inference and decision-making. For example, accurately estimating end-to-end transaction performance is essential both for monitoring compliance with service-level agreements (SLAs) and for performance optimization (e.g., choosing the highest-bandwidth server for a download request in a content-distribution system). However, exhaustive pairwise measurements of end-to-end performance are infeasible in large systems, and they cannot be kept up-to-date in highly dynamic environments. Thus, a natural alternative is to predict unobserved end-to-end performance from available historic data, with a minimal amount of additional "active" measurements. In this talk, I will present our recent work on active sampling in collaborative prediction, with applications to end-to-end performance prediction and best server selection in content-distribution systems. Collaborative prediction is the problem of predicting unobserved entries in sparsely observed matrices, e.g., product ratings by different users in online recommender systems, or historic data on the connection quality (e.g., bandwidth) between nodes in a network. However, the quality of prediction may be quite sensitive to the choice of available samples, which motivates active sampling approaches. In this work, we suggest an active sampling method based on the recently proposed Maximum-Margin Matrix Factorization (MMMF), a linear factor model that was shown to outperform state-of-the-art collaborative prediction techniques. MMMF is formulated as a semi-definite program (SDP) that finds a low-norm (rather than traditional low-rank) matrix factorization, and is also closely related to learning max-margin linear discriminants (SVMs).
This relation to SVMs inspires several margin-based active sampling heuristics that allow for an exploration-exploitation trade-off on top of MMMF factor models and demonstrate excellent empirical results, saving hundreds of samples needed to achieve the desired predictive accuracy, in a variety of practical domains, including both traditional recommender systems and systems-management applications. If time permits, I will also discuss our prior work on network fault diagnosis, i.e., recovering the most-likely states of unobserved system components given the outcomes of end-to-end test transactions, called probes. Our focus here is on achieving good trade-offs between diagnostic accuracy and the cost of testing and the computational complexity of diagnosis. [...] results characterizing these trade-offs, such as a lower bound on the number of probes necessary to [...] active online approach to selecting most-informative tests, as well as (3) approximation techniques using "loopy" belief propagation for handling intractable inference problems involved in both diagnosis and [...] results on real applications demonstrate the advantages of active [...] that greatly reduces the number of probes (up to 75%) and the time needed to diagnose problems.
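
As a rough illustration of the margin-based sampling idea mentioned in the abstract, the sketch below picks the next entry to measure as the unobserved matrix entry whose predicted score lies closest to the decision boundary. It is not the speaker's method: the factor matrices U and V are hypothetical values assumed to have been fit already (for example, by a factor model such as MMMF), the "smallest margin" rule is only one possible heuristic, and the function names are made up for this example.

```python
def predict(U, V, i, j):
    """Predicted score for entry (i, j): dot product of row factor i and column factor j."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def min_margin_query(U, V, observed):
    """Return the unobserved (row, col) whose predicted score is closest to zero,
    together with that margin. A small margin means the sign of the prediction
    is least certain, so measuring that entry is assumed to be most informative."""
    best, best_margin = None, float("inf")
    for i in range(len(U)):
        for j in range(len(V)):
            if (i, j) in observed:
                continue  # already measured, nothing to gain
            margin = abs(predict(U, V, i, j))
            if margin < best_margin:
                best, best_margin = (i, j), margin
    return best, best_margin


# Hypothetical factors for 3 clients (rows) and 3 servers (columns).
U = [[1.0, 0.0], [0.5, -0.5], [-1.0, 0.2]]
V = [[1.0, 1.0], [0.2, 0.1], [-1.0, 0.5]]
observed = {(1, 0)}  # entries that have already been measured

query, margin = min_margin_query(U, V, observed)
print("next entry to measure:", query, "margin:", margin)
```

Always choosing the smallest margin is pure exploration of uncertain entries; a sampler that trades exploration against exploitation would mix this rule with others, which this sketch does not attempt.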

Bio: Research [...] Gubkin [...] of probabilistic inference, machine learning, and information [...]. Particularly, she has been working on approximate inference in graphical models, information-theoretic experiment [...] and their applications to the area of management of complex distributed [...] prediction and online [...] include dimensionality reduction and feature selection in bioinformatics and neuroscience (analysis of MRI data). She is also an adjunct professor at Columbia University, where she taught machine-learning courses in the EE and CS Departments.
