Dealing with the volume, complexity, and diversity of data currently being generated by scientific experiments and simulations often causes scientists to waste productive time. Scientific Data Management: Challenges, Technology, and Deployment describes cutting-edge technologies and solutions for managing and analyzing vast amounts of data, helping scientists focus on their scientific goals.
The book begins with coverage of efficient storage systems, discussing how to write and read large volumes of data without slowing the simulation, analysis, or visualization processes. It then focuses on efficient data movement and the management of storage space, and explores emerging database systems for scientific data. The book also addresses how best to organize data for analysis, how to search large datasets effectively, how to automate multistep scientific process workflows, and how to collect metadata and lineage information automatically.
This book provides a comprehensive understanding of the latest techniques for managing data during scientific exploration processes, from data generation to data analysis. Enhanced by numerous detailed color images, it includes real-world examples of applications drawn from biology, ecology, geology, climatology, and more.
Dr. Shoshani discusses the book in an interview with International Science Grid This Week (iSGTW): http://www.isgtw.org/?pid=1002259
Storage Technology and Efficient Storage Access
Storage Technology, Jason Hick and John Shalf
Parallel Data Storage and Access, Robert Ross, Alok Choudhary, Garth Gibson, and Wei-keng Liao
Dynamic Storage Management, Arie Shoshani, Flavia Donno, Junmin Gu, Jason Hick, Maarten Litmaath, and Alex Sim
Data Transfer and Scheduling
Coordination of Access to Large-Scale Datasets in Distributed Environments, Tevfik Kosar, Andrei Hutanu, Jon MacLaren, and Douglas Thain
High-Throughput Data Movement, Scott Klasky, Hasan Abbasi, Viraj Bhat, Ciprian Docan, Steve Hodson, Chen Jin, Jay Lofstead, Manish Parashar, Karsten Schwan, and Matthew Wolf
Specialized Retrieval Techniques and Database Systems
Accelerating Queries on Very Large Datasets, Ekow Otoo and Kesheng Wu
Emerging Database Systems in Support of Scientific Data, Per Svensson, Peter Boncz, Milena Ivanova, Martin Kersten, Niels Nes, and Doron Rotem
Data Analysis, Integration, and Visualization Methods
Scientific Data Analysis, Chandrika Kamath, Nikil Wale, George Karypis, Gaurav Pandey, Vipin Kumar, Krishna Rajan, Nagiza F. Samatova, Paul Breimyer, Guruprasad Kora, Chongle Pan, and Srikanth Yoginath
Scientific Data Management Challenges in High-Performance Visual Data Analysis, E. Wes Bethel, Prabhat, Hank Childs, Ajith Mascarenhas, and Valerio Pascucci
Interoperability and Data Integration in the Geosciences, Michael Gertz, Carlos Rueda, and Jianting Zhang
Analyzing Data Streams in Scientific Applications, Tore Risch, Samuel Madden, Hari Balakrishnan, Lewis Girod, Ryan Newton, Milena Ivanova, Erik Zeitler, Johannes Gehrke, Biswanath Panda, and Mirek Riedewald
Scientific Process Management
Metadata and Provenance Management, Ewa Deelman, Bruce Berriman, Ann Chervenak, Oscar Corcho, Paul Groth, and Luc Moreau
Scientific Process Automation and Workflow Management, Bertram Ludäscher, Ilkay Altintas, Shawn Bowers, Julian Cummings, Terence Critchlow, Ewa Deelman, David De Roure, Juliana Freire, Carole Goble, Matthew Jones, Scott Klasky, Timothy McPhillips, Norbert Podhorszki, Claudio Silva, Ian Taylor, and Mladen Vouk
Conclusions and Future Outlook
Arie Shoshani is a senior staff scientist at Lawrence Berkeley National Laboratory, where he heads the Scientific Data Management Research Group. Dr. Shoshani is also the director of the Scientific Data Management Center, one of several large computer science centers supported by the SciDAC program of the U.S. Department of Energy.
Doron Rotem is a senior staff scientist at Lawrence Berkeley National Laboratory, where he heads the research program on scientific data management.
"… Each chapter contains insights and experience gleaned by experts and luminaries in storage who are confronting and managing the data tsunami that has now inundated the leading-edge scientific and supercomputing centers around the world. Individuals in a variety of scientific and commercial areas who are struggling to manage large amounts of data should find this book both educational and useful."
—Ron Farber, Scientific Computing, 2010