446 Pages 75 B/W Illustrations
    by Chapman & Hall

    Data-intensive science has the potential to transform scientific research and quickly translate scientific progress into complete solutions, policies, and economic success. But this collaborative science still lacks effective access to, and exchange of, knowledge among scientists, researchers, and policy makers across a range of disciplines. Bringing together leaders from multiple scientific disciplines, Data-Intensive Science shows how a comprehensive integration of various techniques and technological advances can effectively harness the vast amount of data being generated and significantly accelerate scientific progress to address some of the world’s most challenging problems.

    In the book, a diverse cross-section of application, computer, and data scientists explores the impact of data-intensive science on current research and describes emerging technologies that will enable future scientific breakthroughs. The book identifies best practices used to tackle challenges facing data-intensive science as well as gaps in these approaches. It also focuses on the integration of data-intensive science into standard research practice, explaining how components in the data-intensive science environment need to work together to provide the necessary infrastructure for community-scale scientific collaborations.

    Organizing the material based on a high-level, data-intensive science workflow, this book provides an understanding of the scientific problems that would benefit from collaborative research, the current capabilities of data-intensive science, and the solutions to enable the next round of scientific advancements.

    What Is Data-Intensive Science?, Terence Critchlow and Kerstin Kleese van Dam

    Where Does All the Data Come From?, Geoffrey Fox, Tony Hey, and Anne Trefethen

    Data-Intensive Grand Challenge Science Problems
    Large-Scale Microscopy Imaging Analytics for In Silico Biomedicine, Joel Saltz, Fusheng Wang, George Teodoro, Lee Cooper, Patrick Widener, Jun Kong, David Gutman, Tony Pan, Sharath Cholleti, Ashish Sharma, Daniel Brat, and Tahsin Kurc

    Answering Fundamental Questions about the Universe, Eric S. Myra and F. Douglas Swesty

    Materials of the Future: From Business Suits to Space Suits, Mark F. Horstemeyer

    Case Studies
    Earth System Grid Federation: Infrastructure to Support Climate Science Analysis as an International Collaboration: A Data-Driven Activity for Extreme-Scale Climate Science, Dean N. Williams, Ian T. Foster, Bryan Lawrence, and Michael Lautenschlager

    Data-Intensive Production Grids, Bob Jones and Ian Bird 

    EUDAT: Toward a Pan-European Collaborative Data Infrastructure, D. Lecarpentier, J. Reetz, and P. Wittenburg

    From Challenges to Solutions
    Infrastructure for Data-Intensive Science: A Bottom-Up Approach, Eli Dart and William Johnston

    A Posteriori Ontology Engineering for Data-Driven Science, Damian D.G. Gessler, Cliff Joslyn, and Karin Verspoor

    Transforming Data into the Appropriate Context, Bill Howe

    Bridging the Gap between Scientific Data Producers and Consumers: A Provenance Approach, Eric G. Stephan, Paulo Pinheiro, and Kerstin Kleese van Dam

    In Situ Exploratory Data Analysis for Scientific Discovery, Kanchana Padmanabhan, Sriram Lakshminarasimhan, Zhenhuan Gong, John Jenkins, Neil Shah, Eric Schendel, Isha Arkatkar, Rob Ross, Scott Klasky, and Nagiza F. Samatova

    Interactive Data Exploration, Brian Summa, Attila Gyulassy, Peer-Timo Bremer, and Valerio Pascucci

    Linked Science: Interconnecting Scientific Assets, Tomi Kauppinen, Alkyoni Baglatzi, and Carsten Keßler

    Summary and Conclusions, Terence Critchlow and Kerstin Kleese van Dam

    Index

    Biography

    Terence Critchlow is the chief scientist of the Scientific Data Management Group in the Computational Sciences and Mathematics Division of the Pacific Northwest National Laboratory (PNNL), where he leads projects on data analysis, data dissemination, and workflow systems. A senior member of IEEE and ACM, Dr. Critchlow earned a PhD in computer science from the University of Utah. His research focuses on large-scale data management, metadata, data analysis, online analytical processing, data integration, data dissemination, and scientific workflows.

    Kerstin Kleese van Dam is an associate division director and lead of the Scientific Data Management Group at PNNL. In 2006, she received the British Female Innovators and Inventors Silver Award for the effective management of scientific data. Her research focuses on data management and analysis in extreme-scale environments.

    "This nicely integrated collection of contributions is an attempt to familiarize readers with this challenging aspect of science in the 21st century. The editors draw a picture of the future of scientific data production along the lines of the grand challenges identified by the National Academy of Engineering. ... This book is elegantly written, and intended for decision-makers. ... It achieves a good balance between technical and strategic thinking. This makes it a good choice for scientific decision-makers such as directors of institutes and universities, who are in fact in a position to shape the future networking structures for global data management."
    --Hamid R. Noori, Computing Reviews