PEDS Advance Access originally published online on April 14, 2008
Protein Engineering Design and Selection 2008 21(6):379-386; doi:10.1093/protein/gzn015
Dynameomics: a multi-dimensional analysis-optimized database for dynamic protein data
1 Biomedical and Health Informatics Program, 2 Department of Bioengineering, University of Washington, Box 355013, Seattle, WA 98195-5013, USA
3 To whom correspondence should be addressed. E-mail: daggett{at}u.washington.edu
The Dynameomics project is our effort to characterize the native-state dynamics and folding/unfolding pathways of representatives of all known protein folds by way of molecular dynamics simulations, as described by Beck et al. (in Protein Eng. Des. Select., the first paper in this series). The data produced by these simulations are highly multidimensional in structure and multiple terabytes in size. Both of these features present significant challenges for storage, retrieval and analysis. For optimal data modeling and flexibility, we needed a platform that supported both multidimensional indices and hierarchical relationships between related types of data and that could be integrated within our data warehouse, as described in the accompanying paper directly preceding this one. For these reasons, we have chosen On-line Analytical Processing (OLAP), a multidimensional analysis-optimized database, as the analytical platform for these data. OLAP is a mature technology in the financial sector, but it has not been used extensively for scientific analysis. Our project is furthermore unusual in its focus on the multidimensional and analytical capabilities of OLAP rather than its aggregation capacities. The dimensional data model and hierarchies are very flexible. The query language expresses complex analyses concisely and supports rapid data retrieval. OLAP shows great promise for dynamic protein analysis in bioengineering and biomedical applications. In addition, OLAP may have similar potential for other scientific and engineering applications involving large and complex datasets.
Keywords: data warehouse/Dynameomics/molecular dynamics/protein dynamics/OLAP
Received February 23, 2008; revised February 23, 2008; accepted February 25, 2008.
This article has been cited by other articles:
D. A. C. Beck, D. O. V. Alonso, D. Inoyama, and V. Daggett. The intrinsic conformational propensities of the 20 naturally occurring amino acids and reflection of these propensities in proteins. PNAS, August 26, 2008; 105(34): 12259-12264.
A. M. Simms, R. D. Toofanny, C. Kehl, N. C. Benson, and V. Daggett. Dynameomics: design of a computational lab workflow and scientific data repository for protein simulations. Protein Eng. Des. Sel., June 1, 2008; 21(6): 369-377.

