All seminars are held in the Kennedy room. Morning
seminars are 1 session,
Seminar 1: Data Management in Location-Dependent Information Services
(Tuesday March 30, morning)
Location-dependent information services (LDISs) answer queries in accordance with the locations the queries are associated with (e.g., the locations from which the queries are issued). The emergence of LDISs results from the convergence of high-speed wireless networks, personal portable devices, and locating techniques. LDISs have a variety of promising applications, such as local information access (e.g., traffic reports, news, and navigation maps) and nearest neighbor queries (e.g., finding the nearest restaurant), and are expected to become an integral part of our daily life.
This seminar will provide background and an overview of research on location-dependent information access in mobile and pervasive environments. In particular, it will discuss the following topic areas:
1. Positioning technologies;
2. Moving objects tracking;
3. Location-dependent query processing;
4. Location-dependent cache management;
5. System integration;
6. Privacy and security.
Biographies:
Wang-Chien Lee is an associate professor of computer science and engineering at
Baihua Zheng is an assistant professor in the
Jianliang Xu is an assistant professor in the Department of Computer Science at
Seminar 2: 'My Personal Web': A Tutorial on Personalization and Privacy for Web and Converged Services
(Tuesday March 30, afternoon)
The web services paradigm holds the promise of tremendous
flexibility in how services are combined to meet the needs of individual
end-users. The "convergence" of
networks (wireline telephony, wireless, data) further
enhances the web services paradigm, by enabling the incorporation of real time
contextual information (e.g., presence and location) along with opportunities
for web services to impact the physical world more immediately (e.g., a vending
machine delivering a soda based on a purchase via a cell phone). But it will not be possible for most end-users
to enjoy the rich and intricate possibilities, unless
a broad variety of personalization technologies are available and respect the
end user's legitimate need for privacy.
This tutorial begins with examples illustrating why
personalization will be so important for the emerging web and converged
services. The main body of the tutorial
focuses on three inter-related technologies:
1. profile data management, the ability for services to share and access end-user profile data (including address, credit card, "simple" preferences, current location, current presence, ...) as appropriate for the services to be provided.
2. preference and policy management, the ability to store and execute on intricate, interrelated preferences that end-users may have (e.g., "during working hours, calls from strangers should be routed to voice-mail"; "I usually work from 9 to 6, but on Thursdays it is from 8 to 4"; ...).
3. personalized and privacy-conscious data sharing of profile data and preferences, the notion that an end-user should have complete control over what profile and preference data is shared with whom and under what circumstances and how it is interpreted.
In addition to describing emerging approaches for
providing these capabilities, the tutorial will describe how to add value to
applications by using personalization, from both the end-user and the
application provider perspectives.
Biographies:
Arnaud Sahuguet is a Member of the Technical Staff in the Network Data
and Services Department at Bell Laboratories, Lucent Technologies. He received
his Ph.D. in computer science at the
Irini Fundulaki holds a postdoctoral position in the
Network Data and Services Department at Bell Laboratories, Lucent Technologies.
She received her Ph.D. in computer science from the Conservatoire National des
Arts et Métiers in
Seminar 3: Similarity Search in Multimedia Databases
(Wednesday March 31, morning)
There are
many practical applications that benefit from multimedia databases, e.g.,
molecular biology, medicine, CAD/CAM, and geography. An important research
issue in the field of multimedia databases is the content-based retrieval of
similar objects. Given a multimedia query object, the search for an exact match
in a database is not meaningful in most applications, because the probability
that two multimedia objects are identical is negligible (unless they are
digital copies from the same source). For this reason, the development of efficient
and effective similarity search techniques has become an important topic in the
multimedia database research community.
The goal of this advanced technology seminar is to provide an overview of the similarity
search problem and to present the state-of-the-art techniques for performing
efficient and effective similarity queries in multimedia databases. The seminar
begins with an introduction to, and motivation for, multimedia databases. The two
main approaches for describing multimedia objects (as elements in a metric
space or in a vector space) are introduced, along with a description of the
"Multimedia Content Description Interface" (MPEG-7) standard. The efficiency issue is addressed for both metric
and vector space approaches, describing the data structures and algorithms used
to answer similarity queries. For the effectiveness issue, the seminar
introduces some widely used retrieval performance measures. Several examples of
techniques for particular multimedia applications (text, image, CAD, 3D
objects, audio and video) are presented.
Biographies:
Daniel A. Keim is working in the area of multimedia databases, image
similarity search and high-dimensional indexing structures. He has published
extensively on multimedia databases and data mining, and he has given tutorials
on related issues at several large conferences including SIGMOD, VLDB, ICDE and
KDD. He was program co-chair of the KDD conference in 2002 and of the IEEE
Information Visualization Symposia in 1999 and 2000, and he is an editor of IEEE Trans.
on Knowledge and Data Engineering, IEEE Trans. on Visualization and Computer
Graphics, and Palgrave's Information Visualization
Journal. He received his PhD in Computer Science from the
Benjamin Bustos is working in the area of multimedia databases and
similarity search in high-dimensional spaces. He received an MSc degree in Computer Science in 2002 from the
Seminar 4: XML Query Processing
(Wednesday March 31, afternoon)
XQuery is starting to gain significant traction as a language for querying and transforming XML data. It is used in a variety of different products. Examples to date include XML database systems, XML document repositories, XML data integration, workflow systems, and publish and subscribe systems. In addition, XPath, of which XQuery is a superset, is used in various products such as Web browsers. Although the W3C XQuery specification has not yet attained recommendation status, and the definition of the language has not entirely stabilized, a number of alternative proposals to implement and optimize XQuery have appeared both in industry and in the research community. Given the wide range of applications for which XQuery is applicable, a wide spectrum of alternative techniques has been proposed for XQuery processing. Some of these techniques are useful only for certain applications; others are general-purpose.
The goal of this tutorial is to give an overview of the
existing approaches to processing XQuery expressions and
to give details of the most important techniques. The presenters have experience designing
and building an industrial-strength XQuery engine
[1]. The tutorial will give details of
that XQuery engine, but it will also give
extensive coverage of other XQuery engines and of the
state of the art in the research community.
1. Introduction to XQuery
- Motivation
- XQuery data model
- XQuery type system
- Basic query language concepts
2. Internal Representation of XML Data
- DOM
- SAX Events
- TokenStream
- Skeleton
- Vertical Partitioning
3. XQuery Algebras
- XQuery Core vs. Relational Algebra
- XQuery Algebras from Research Projects
4. XPath Query Processing
- Transducers, Automata, etc.
5. XQuery Optimization
- XML query equivalence
- Rewrite Rules
- Cost Models
6. XQuery Runtime Systems
- Iterator Models
- Algorithms for XQuery Operators
7. XML Indexes
- Value and path indexes, other
8. XQuery Products and Prototypes
- XQRL/BEA, Galax, Saxon, etc. (as available)
9. Advanced Query Processing Techniques, Related Topics
- Querying compressed XML data
- Multi-Query Optimization
- Publish & Subscribe and XML Information Filters
- XML Data Integration
- XML Updates
- XML integrity constraints
10. Summary
Biographies:
Daniela Florescu is a Senior
Software Engineer at BEA Systems. She received her MS in Mathematics in 1990
from the
Donald Kossmann is a Full Professor of Computer Science at the
Seminar 5:
(Thursday April 1, morning)
By meta data management, we mean techniques for manipulating schemas and schema-like objects (such as interface definitions and web site maps) and mappings between them. Many popular research problems in the past five years are primarily meta data problems, such as data warehouse tools (e.g., ETL -- to extract, transform and load), data integration, the semantic web, generation of XML or object-oriented wrappers for SQL databases, and generation of wrappers for web sites. Other classical meta data problems are information resource management, design tool support and integration, and schema evolution and data migration.
Despite its longevity and continued importance, there is no widely accepted conceptual framework for the meta data field, as there is for many other database topics, such as access methods, query processing, and transaction management. In this tutorial, we propose such a conceptual framework. It consists of three layers: applications, design patterns, and basic operators. Applications are the end-user problems to be solved, like those listed in the previous paragraph. Design patterns are generic problems that need to be solved in support of many different applications, such as meta modeling (for all meta data problems), answering queries using views (for data integration and the semantic web), and change propagation (for data translation, schema evolution, and round-trip engineering). Basic operators are procedures that are needed to support multiple design patterns and applications, such as matching schemas to produce a mapping, merging schemas based on a mapping, and composing mappings.
We will describe several meta data management problems. For each problem, we will explain which design patterns and operators are needed to solve it. We will summarize the main approaches to each design pattern and operator -- the main choices of language, data structures, and algorithms -- and will highlight the relevant papers that address it.
This tutorial is targeted at both practicing engineers and researchers. The former will learn about the latest solutions to important meta data problems and about the many difficult unsolved problems that are best avoided. Database researchers, especially professors, will benefit from considering the conceptual framework that we propose, since, as far as we know, no database textbook treats meta data management as a separate topic.
Biographies:
Philip A. Bernstein is a
researcher at Microsoft Corporation. Over the past 25 years, he has been a
product architect and industrial researcher at Microsoft and at Digital
Equipment Corp., a professor at
Sergey Melnik is a Ph.D. candidate in Computer Science at
Seminar 6: Implementation and Research Issues in Query Processing for Wireless Sensor Networks
(Thursday April 1, afternoon)
This is a three-hour tutorial discussing the design
and implementation of software systems as well as open research problems
related to data processing and collection in wireless sensor networks. During
the first hour-and-a-half, we focus on the design of the TinyDB
data collection system for networks of
Biographies:
Wei Hong is a
senior researcher at Intel Research, Berkeley.
His current research focuses on data management in sensor networks. He leads the Tiny Application Sensor Kit
(TASK) project at Intel Research and co-designed/developed TinyDB,
an open-source, in-network sensor database system with Samuel Madden. Prior to
joining Intel Research, Wei co-founded and
architected the products of two startup companies: Illustra
Information Technology Inc. and Cohera Corp. Illustra developed the first successful commercial
Object-Relational database system. It was acquired by Informix, now part of
IBM. Cohera provided electronic catalog management
solutions based on a novel federated database system that it developed. Its
technology was acquired by PeopleSoft. Wei earned a Ph.D. in computer science from UC Berkeley and
holds a master's and two bachelor's degrees from
Samuel Madden is an Assistant Professor in
the Department of Electrical Engineering and Computer Science and a member of
the Computer Science and Artificial Intelligence Laboratory at MIT. He
received his Ph.D. in Computer Science from the
Seminar 7: Data Mining for Intrusion Detection: Techniques, Applications and Systems
(Friday April 2, morning)
An intrusion is defined as any set of actions that
compromise the integrity, confidentiality or availability of a resource.
Intrusion detection is an important task for information infrastructure
security. One major challenge in intrusion detection is that we have to
identify camouflaged intrusions among a huge volume of normal communication
activity. Data mining aims to identify valid, novel, potentially useful, and
ultimately understandable patterns in massive data. It is demanding to apply
data mining techniques to detect various intrusions.
In the last several years, some exciting and important
advances have been made in intrusion detection using data mining techniques. Research
results have been published and some prototype systems have been established.
Inspired by the huge demands from applications, the interactions and
collaborations between the communities of security and data mining have been
boosted substantially.
This seminar will present an interdisciplinary survey of
data mining techniques for intrusion detection so that the researchers from
computer security and data mining communities can share the experiences and
learn from each other. Some data mining based intrusion detection systems will
also be reviewed briefly. Moreover, research challenges and problems will be
discussed so that future collaborations may be stimulated. For data
mining/database researchers and practitioners, the seminar will provide background
knowledge and opportunities for applying data mining techniques to intrusion
detection and computer security. For
computer security researchers and practitioners, it provides knowledge on how
data mining can benefit and enhance computer security. We will try to
understand and appreciate the following technical issues.
1. What is intrusion detection? Why is it challenging, and why can data mining techniques really help?
2. What are the major data mining techniques available for intrusion detection?
3. Successful applications of data mining techniques in intrusion detection and the experiences.
Biographies:
Shambhu J. Upadhyaya is an Associate
Professor of Computer Science and Engineering at the State University of New
York at
Faisal Farooq received the
B.Eng. in Computer Science from National Institute of Technology,
Venugopal Govindaraju is a professor of
Computer Science and Engineering at State University of New York at