New England Database Society

Friday, April 28, 2006

sponsored by Sun Microsystems

NEDS

Data Management Technology for Scientific Applications

Anastassia Ailamaki
 Carnegie Mellon University

Friday, May 5, 2006, 4:00 PM
Volen 101, Brandeis University

(preceded by a wine and cheese reception at 3:00 pm)

Abstract:

Scientific applications are becoming increasingly data-intensive. Due to advances in instrumentation, experimental infrastructures and simulation capabilities, scientific disciplines are faced with the challenge of storing and processing unprecedented volumes of data. Observation-based datasets are mostly bound by database physical design issues, whereas managing simulation-generated data requires computational support and specialized indexing. Unfortunately, today's database systems do not provide adequate tools for automated physical design or appropriate indexing methods for scientific datasets.

This talk summarizes current results from our ongoing efforts to design low-overhead, high-impact database support for large-scale scientific applications. The first part addresses query execution performance and manageability of observation-based datasets (such as astronomical data) stored in conventional relational database systems. We present AutoPart, an algorithm for automatically partitioning database tables based on a representative query workload. Our experiments with real astronomy data demonstrate that workload-aware partitioning improves query performance without incurring storage or update performance overheads. The second part of the talk addresses data generated from scientific simulations, such as earthquake simulations, which produce and process 3-D structures represented by large unstructured tetrahedral meshes. Unfortunately, conventional spatial indexing techniques are inadequate for analyzing and visualizing this data efficiently. We present Directed Local Search (DLS), an efficient indexing and query processing technique for unstructured tetrahedral meshes, which significantly improves performance when running queries commonly used in scientific applications.

Speaker Bio:

Anastassia (Natassa) Ailamaki received a B.Sc. degree in Computer Engineering from the Polytechnic School of the University of Patra, Greece, M.Sc. degrees from the Technical University of Crete, Greece and from the University of Rochester, NY, and a Ph.D. degree in Computer Science from the University of Wisconsin-Madison. In 2001, she joined the Computer Science Department at Carnegie Mellon University as an Assistant Professor. Her research interests are in the broad area of database systems and applications, with emphasis on database system behavior on modern processor hardware and disks. Her projects at Carnegie Mellon (including Staged Database Systems, Cache-Resident Data Bases, and the Fates Storage Manager) aim at building systems to strengthen the interaction between the database software and the underlying hardware and I/O devices. In addition, she is working on automated schema design and computational database support for scientific applications, storage device modeling and performance prediction, as well as internet query caching.

Natassa has received a Sloan Research Fellowship (2005), six best-paper awards (VLDB 2001, Performance 2002, VLDB PhD Workshop 2003, ICDE 2004, FAST 2005, and ICDE 2006 (demo)), an NSF CAREER award (2002), and IBM Faculty Partnership awards in 2001, 2002, and 2003. She is a member of IEEE and ACM, and has also been a CRA-W mentor.


Maintained by Dina Goldin dqg AT cse.uconn.edu