1st Edition

Multi-Label Dimensionality Reduction

By Liang Sun, Shuiwang Ji, Jieping Ye
Copyright 2014
    208 pages, 23 B/W illustrations
    Published by Chapman & Hall

    Similar to other data mining and machine learning tasks, multi-label learning suffers from the curse of dimensionality. An effective way to mitigate this problem is dimensionality reduction, which extracts a small number of features by removing irrelevant, redundant, and noisy information. The data mining and machine learning literature currently lacks a unified treatment of multi-label dimensionality reduction that incorporates both algorithmic developments and applications.

    Addressing this shortfall, Multi-Label Dimensionality Reduction covers the methodological developments, theoretical properties, computational aspects, and applications of many multi-label dimensionality reduction algorithms. It explores numerous research questions, including:

    • How to fully exploit label correlations for effective dimensionality reduction
    • How to scale dimensionality reduction algorithms to large-scale problems
    • How to effectively combine dimensionality reduction with classification
    • How to derive sparse dimensionality reduction algorithms to enhance model interpretability
    • How to perform multi-label dimensionality reduction effectively in practical applications

    The authors emphasize their extensive work on dimensionality reduction for multi-label learning. Using a case study of Drosophila gene expression pattern image annotation, they demonstrate how to apply multi-label dimensionality reduction algorithms to solve real-world problems. A supplementary website provides a MATLAB® package for implementing popular dimensionality reduction algorithms.

    Introduction
    Introduction to Multi-Label Learning
    Applications of Multi-Label Learning
    Challenges of Multi-Label Learning
    State of the Art
    Dimensionality Reduction for Multi-Label Learning
    Overview of the Book
    Notations
    Organization

    Partial Least Squares
    Basic Models of Partial Least Squares
    Partial Least Squares Variants
    Partial Least Squares Regression
    Partial Least Squares Classification

    Canonical Correlation Analysis
    Classical Canonical Correlation
    Sparse CCA
    Relationship between CCA and Partial Least Squares
    The Generalized Eigenvalue Problem

    Hypergraph Spectral Learning
    Hypergraph Basics
    Multi-Label Learning with a Hypergraph
    A Class of Generalized Eigenvalue Problems
    The Generalized Eigenvalue Problem versus the Least Squares Problem
    Empirical Evaluation

    A Scalable Two-Stage Approach for Dimensionality Reduction
    The Two-Stage Approach with Regularization
    Empirical Evaluation

    A Shared-Subspace Learning Framework
    The Framework
    An Efficient Implementation
    Related Work
    Connections with Existing Formulations
    A Feature Space Formulation
    Empirical Evaluation

    Joint Dimensionality Reduction and Classification
    Background
    Joint Dimensionality Reduction and Multi-Label Classification
    Dimensionality Reduction with Different Input Data
    Empirical Evaluation

    Nonlinear Dimensionality Reduction: Algorithms and Applications
    Background on Kernel Methods
    Kernel Centering and Projection
    Kernel Canonical Correlation Analysis
    Kernel Hypergraph Spectral Learning
    The Generalized Eigenvalue Problem in the Kernel-Induced Feature Space
    Kernel Least Squares Regression
    Dimensionality Reduction and Least Squares Regression in the Feature Space
    Gene Expression Pattern Image Annotation

    Appendix: Proofs

    References

    Index

    Biography

    Liang Sun is a scientist in R&D at Opera Solutions, a leading company in big data science and predictive analytics. He received a PhD in computer science from Arizona State University. His research interests lie broadly in the areas of data mining and machine learning. His team won second place in the KDD Cup 2012 Track 2 and fifth place in the Heritage Health Prize. In 2010, he received an ACM SIGKDD best research paper honorable mention for his work on an efficient implementation for a class of dimensionality reduction algorithms.

    Shuiwang Ji is an assistant professor of computer science at Old Dominion University. He received a PhD in computer science from Arizona State University. His research interests include machine learning, data mining, computational neuroscience, and bioinformatics. He received the Outstanding PhD Student Award from Arizona State University in 2010 and the Early Career Distinguished Research Award from Old Dominion University’s College of Sciences in 2012.

    Jieping Ye is an associate professor of computer science and engineering at Arizona State University, where he is also the associate director for big data informatics in the Center for Evolutionary Medicine and Informatics and a core faculty member of the Biodesign Institute. He received a PhD in computer science from the University of Minnesota, Twin Cities. His research interests include machine learning, data mining, and biomedical informatics. He is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence. He has won numerous awards from Arizona State University and was a recipient of an NSF CAREER Award. His papers have also been recognized at the International Conference on Machine Learning, KDD, and the SIAM International Conference on Data Mining (SDM).