Scientific Data Management: Challenges, Technology, and Deployment

ISBN 9781420069808
Cat# C6980



eBook (VitalSource)
ISBN 9781420069815
Cat# CE6980



  • Examines the problems encountered when managing and analyzing large and complex data volumes
  • Presents techniques for managing data in different phases of scientific exploration
  • Discusses efficient access to storage systems, including parallel file systems
  • Explores how to efficiently move large data volumes and automatically optimize the physical organization of data for fast analysis
  • Includes specialized data search methods, feature discovery, and statistical analysis methods
  • Describes technology for scientific workflow, metadata, and provenance management that ensures multiple tasks execute in sequence or concurrently in a robust, tractable, and recoverable manner


Dealing with the volume, complexity, and diversity of data currently being generated by scientific experiments and simulations often causes scientists to waste productive time. Scientific Data Management: Challenges, Technology, and Deployment describes cutting-edge technologies and solutions for managing and analyzing vast amounts of data, helping scientists focus on their scientific goals.

The book begins with coverage of efficient storage systems, discussing how to write and read large volumes of data without slowing the simulation, analysis, or visualization processes. It then turns to efficient data movement and the management of storage spaces, and explores emerging database systems for scientific data. The book also addresses how best to organize data for analysis, how to search large datasets effectively, how to automate multistep scientific process workflows, and how to automatically collect metadata and lineage information.

This book provides a comprehensive understanding of the latest techniques for managing data during scientific exploration processes, from data generation to data analysis. Enhanced by numerous detailed color images, it includes real-world examples of applications drawn from biology, ecology, geology, climatology, and more.

Dr. Shoshani discusses the book in an interview with International Science Grid This Week (iSGTW).

Table of Contents

Storage Technology and Efficient Storage Access

Storage Technology, Jason Hick and John Shalf

Parallel Data Storage and Access, Robert Ross, Alok Choudhary, Garth Gibson, and Wei-keng Liao

Dynamic Storage Management, Arie Shoshani, Flavia Donno, Junmin Gu, Jason Hick, Maarten Litmaath, and Alex Sim

Data Transfer and Scheduling

Coordination of Access to Large-Scale Datasets in Distributed Environments, Tevfik Kosar, Andrei Hutanu, Jon McLaren, and Douglas Thain

High-Throughput Data Movement, Scott Klasky, Hasan Abbasi, Viraj Bhat, Ciprian Docan, Steve Hodson, Chen Jin, Jay Lofstead, Manish Parashar, Karsten Schwan, and Matthew Wolf

Specialized Retrieval Techniques and Database Systems

Accelerating Queries on Very Large Datasets, Ekow Otoo and Kesheng Wu

Emerging Database Systems in Support of Scientific Data, Per Svensson, Peter Boncz, Milena Ivanova, Martin Kersten, Niels Nes, and Doron Rotem

Data Analysis, Integration, and Visualization Methods

Scientific Data Analysis, Chandrika Kamath, Nikil Wale, George Karypis, Gaurav Pandey, Vipin Kumar, Krishna Rajan, Nagiza F. Samatova, Paul Breimyer, Guruprasad Kora, Chongle Pan, and Srikanth Yoginath

Scientific Data Management Challenges in High-Performance Visual Data Analysis, E. Wes Bethel, Prabhat, Hank Childs, Ajith Mascarenhas, and Valerio Pascucci

Interoperability and Data Integration in the Geosciences, Michael Gertz, Carlos Rueda, and Jianting Zhang

Analyzing Data Streams in Scientific Applications, Tore Risch, Samuel Madden, Hari Balakrishnan, Lewis Girod, Ryan Newton, Milena Ivanova, Erik Zeitler, Johannes Gehrke, Biswanath Panda, and Mirek Riedewald

Scientific Process Management

Metadata and Provenance Management, Ewa Deelman, Bruce Berriman, Ann Chervenak, Oscar Corcho, Paul Groth, and Luc Moreau

Scientific Process Automation and Workflow Management, Bertram Ludäscher, Ilkay Altintas, Shawn Bowers, Julian Cummings, Terence Critchlow, Ewa Deelman, David De Roure, Juliana Freire, Carole Goble, Matthew Jones, Scott Klasky, Timothy McPhillips, Norbert Podhorszki, Claudio Silva, Ian Taylor, and Mladen Vouk

Conclusions and Future Outlook


Editor Bio(s)

Arie Shoshani is a senior staff scientist at Lawrence Berkeley National Laboratory, where he heads the Scientific Data Management Research Group. Dr. Shoshani is also the director of the Scientific Data Management Center, one of several large computer science centers supported by the SciDAC program of the U.S. Department of Energy.

Doron Rotem is a senior staff scientist at Lawrence Berkeley National Laboratory, where he heads the research program on scientific data management.

Editorial Reviews

"… Each chapter contains insights and experience gleaned by experts and luminaries in storage who are confronting and managing the data tsunami that has now inundated the leading-edge scientific and supercomputing centers around the world. Individuals in a variety of scientific and commercial areas who are struggling to manage large amounts of data should find this book both educational and useful."
—Ron Farber, Scientific Computing, 2010