Focuses on singular value decomposition, semidiscrete decomposition, independent component analysis, non-negative matrix factorization, and tensors
Matches the proper matrix analysis technique to real-world scientific and engineering systems
Discusses several important theoretical and algorithmic problems of matrix decompositions, such as instability
Applies matrix decompositions to the diverse fields of information retrieval, topic detection, geochemistry, astrophysics, microarray analysis, process control, counterterrorism, and social network analysis
Provides MATLAB® scripts to generate examples of matrix decompositions and URLs where tools can be downloaded
Making obscure knowledge about matrix decompositions widely available, Understanding Complex Datasets: Data Mining with Matrix Decompositions discusses the most common matrix decompositions and shows how they can be used to analyze large datasets in a broad range of application areas. Without requiring you to understand every mathematical detail, the book helps you determine which matrix decomposition is appropriate for your dataset and what the results mean.
Explaining the effectiveness of matrices as data analysis tools, the book illustrates the ability of matrix decompositions to provide more powerful analyses and to produce cleaner data than more mainstream techniques. The author explores the deep connections between matrix decompositions and structures within graphs, relating the PageRank algorithm of Google's search engine to singular value decomposition. He also covers dimensionality reduction, collaborative filtering, clustering, and spectral analysis. With numerous figures and examples, the book shows how matrix decompositions can be used to find documents on the Internet, look for deeply buried mineral deposits without drilling, explore the structure of proteins, detect suspicious emails or cell phone calls, and more.
Concentrating on data mining mechanics and applications, this resource helps you model large, complex datasets and investigate connections between standard data mining techniques and matrix decompositions.
Table of Contents
What Is Data Like?
Data Mining Techniques
Why Use Matrix Decompositions?
SINGULAR VALUE DECOMPOSITION (SVD)
Interpreting an SVD
Applications of SVD
Graphs versus Datasets
Eigenvalues and Eigenvectors
Connections to SVD
Overview of the Embedding Process
Datasets versus Graphs
The ATHENS System for Novel Knowledge Discovery
SEMIDISCRETE DECOMPOSITION (SDD)
Interpreting an SDD
Applying an SDD
USING SVD AND SDD TOGETHER
SVD Then SDD
Applications of SVD and SDD Together
INDEPENDENT COMPONENT ANALYSIS (ICA)
Interpreting an ICA
Applying an ICA
Applications of ICA
NON-NEGATIVE MATRIX FACTORIZATION (NNMF)
Interpreting an NNMF
Applying an NNMF
Applications of NNMF
The Tucker3 Tensor Decomposition
The CP Decomposition
Applications of Tensor Decompositions
APPENDIX: MATLAB SCRIPTS
… One of this book’s attractive features is that every chapter contains a discussion relating to the algorithmic issues. One scenario is used as a running illustrative example throughout the book. Several other examples are discussed in different chapters. These examples should help the reader understand the advantages as well as the practical problems associated with any of the proposed matrix-based data mining techniques covered in the book. I recommend this book for anyone interested in using matrix methods for data mining.
—Technometrics, February 2009, Vol. 51, No. 1
This could be a nice companion book for courses in data mining or applied linear algebra. Producing a clear taxonomy of the use and intentions of matrix decompositions in data analysis is very useful to both students and researchers. … Those working with large-scale complex datasets will definitely find this work useful. … I would definitely use it in my own course in data mining.
—Michael W. Berry, University of Tennessee, Knoxville, USA
[This book] is suffused with insightful suggestions for analytical methods and interpretations, drawn from the author's own research and his reading of the literature. … The book has two great strengths. The first is its attempt to provide a unifying framework from which to view a host of important analytical methodologies based on matrix methods. … Second, the book is extremely strong on interpreting the results of matrix methods. … [It] assembles and explains a diverse set of insights that are otherwise widely scattered in the literature. This alone makes the book an important contribution to the community.
—Bruce Hendrickson, Sandia National Laboratories, Albuquerque, New Mexico, USA