New England Database Society (NEDS)
sponsored by Sun Microsystems
Data Management Technology for Scientific Applications
Anastassia Ailamaki
Carnegie Mellon University
Friday, May 5, 2006, 4:00 PM
Volen 101, Brandeis University
(preceded by a wine and cheese reception at 3:00 PM)
Abstract:
Scientific applications are becoming increasingly data-intensive. Due to
advances in instrumentation, experimental infrastructures, and
simulation capabilities, scientific disciplines are faced with the challenge of
storing and processing unprecedented volumes of data. Managing observation-based
datasets is constrained mostly by database physical design, whereas managing
simulation-generated data requires computational support and specialized
indexing. Unfortunately, today's database systems do not provide adequate tools
for automated physical design or appropriate indexing methods for scientific
datasets.
This talk summarizes current results from our ongoing efforts to design
low-overhead, high-impact database support for large-scale scientific
applications. The first part addresses query execution performance and
manageability of observation-based datasets (such as astronomical data) stored
in conventional relational database systems. We present AutoPart, an algorithm
for automatically partitioning database tables based on a representative query
workload. Our experiments with real astronomy data demonstrate that
workload-aware partitioning improves query performance without incurring storage
or update performance overheads. The second part of the talk addresses data
generated from scientific simulations, such as earthquake simulations, which
produce and process 3-D structures represented by large unstructured tetrahedral
meshes. Unfortunately, conventional spatial indexing techniques are inadequate
for analyzing and visualizing this data efficiently. We present Directed Local Search
(DLS), an efficient indexing and query processing technique for unstructured
tetrahedral meshes, which significantly improves performance when running
queries commonly used in scientific applications.
Speaker Bio:
Anastassia (Natassa) Ailamaki received a B.Sc. degree in Computer Engineering
from the Polytechnic School of the University of Patra, Greece, M.Sc. degrees
from the Technical University of Crete, Greece and from the University of
Rochester, NY, and a Ph.D. degree in Computer Science from the University of
Wisconsin-Madison. In 2001, she joined the Computer Science Department at
Carnegie Mellon University as an Assistant Professor. Her research
interests are in the broad area of database systems and applications, with
emphasis on database system behavior on modern processor hardware and disks. Her
projects at Carnegie Mellon (including Staged Database Systems, Cache-Resident
Data Bases, and the Fates Storage Manager) aim at building systems to strengthen
the interaction between the database software and the underlying hardware and
I/O devices. In addition, she is working on automated schema design and
computational database support for scientific applications, storage device
modeling and performance prediction, as well as internet query caching.
Natassa has received a Sloan Research Fellowship (2005), six best-paper awards (VLDB
2001, Performance 2002, VLDB PhD Workshop 2003, ICDE 2004, FAST 2005, and ICDE
2006 (demo)), an NSF CAREER award (2002), and IBM Faculty Partnership awards in
2001, 2002, and 2003. She is a member of IEEE and ACM, and has also been a CRA-W
mentor.
Maintained by Dina Goldin (dqg AT cse.uconn.edu)