1st Edition

Music Emotion Recognition

By Yi-Hsuan Yang, Homer H. Chen
Copyright 2011
    262 Pages 74 B/W Illustrations
    by CRC Press

    Providing a complete review of existing work in music emotion developed in psychology and engineering, Music Emotion Recognition explains how to account for the subjective nature of emotion perception in the development of automatic music emotion recognition (MER) systems. Among the first publications dedicated to automatic MER, it begins with a comprehensive introduction to the essential aspects of MER—including background, key techniques, and applications.

    This ground-breaking reference examines emotion from a dimensional perspective. It defines emotions in music as points in a 2D plane in terms of two of the most fundamental emotion dimensions according to psychologists—valence and arousal. The authors present a computational framework that generalizes emotion recognition from the categorical domain to real-valued 2D space. They also:

    • Introduce novel emotion-based music retrieval and organization methods
    • Describe a ranking-based emotion annotation and model training method
    • Present methods that integrate information extracted from lyrics, chord sequence, and genre metadata for improved accuracy
    • Consider an emotion-based music retrieval system that is particularly useful for mobile devices
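
    As a minimal sketch of the dimensional view described above, the snippet below treats each song as a point in the valence-arousal plane and retrieves the songs nearest to a query point. It is an illustration only, not code from the book: the song names, their coordinates, the [-1, 1] value range, and the function name query_by_emotion_point are all hypothetical, and Euclidean distance is an assumed choice of similarity measure.

```python
import math

# Hypothetical catalogue: each song is a (valence, arousal) point,
# with both coordinates assumed to lie in [-1, 1].
SONGS = {
    "song_a": (0.8, 0.7),    # positive valence, high arousal
    "song_b": (-0.6, 0.5),   # negative valence, high arousal
    "song_c": (-0.7, -0.6),  # negative valence, low arousal
    "song_d": (0.6, -0.5),   # positive valence, low arousal
}


def query_by_emotion_point(songs, point, k=1):
    """Return the names of the k songs closest to `point` in the 2D plane."""
    # Sort song names by Euclidean distance between their point and the query.
    ranked = sorted(songs, key=lambda name: math.dist(songs[name], point))
    return ranked[:k]


if __name__ == "__main__":
    # A user taps a point in the "sad and calm" region of the plane.
    print(query_by_emotion_point(SONGS, (-0.5, -0.5), k=2))
```

    Because a query is just a pair of real numbers, the same lookup works for any point in the plane rather than only for a fixed set of emotion labels, which is the appeal of the dimensional formulation for small touch-screen interfaces.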

    The book details techniques for addressing four issues: the ambiguity and granularity of emotion description, the heavy cognitive load of emotion annotation, the subjectivity of emotion perception, and the semantic gap between low-level audio signals and high-level emotion perception. Complete with more than 360 useful references, 12 MATLAB® code examples, and a listing of key abbreviations and acronyms, this cutting-edge guide supplies the technical understanding and tools needed to develop your own automatic MER system based on the automatic recognition model.

    Introduction
    Importance of Music Emotion Recognition
    Recognizing the Perceived Emotion of Music
    Issues of Music Emotion Recognition
         Ambiguity and Granularity of Emotion Description 
         Heavy Cognitive Load of Emotion Annotation
         Subjectivity of Emotional Perception 
         Semantic Gap between Low-Level Audio Signal and High-Level Human Perception

    Overview of Emotion Description and Recognition
    Emotion Description
         Categorical Approach 
         Dimensional Approach
         Music Emotion Variation Detection
    Emotion Recognition
         Categorical Approach 
         Dimensional Approach
         Music Emotion Variation Detection

    Music Features
    Energy Features
    Rhythm Features
    Temporal Features
    Spectrum Features
    Harmony Features

    Dimensional MER by Regression
    Adopting the Dimensional Conceptualization of Emotion
    VA Prediction 
         Weighted-Sum of Component Functions 
         Fuzzy Approach 
         System Identification Approach (System ID)
    The Regression Approach
         Regression Theory 
         Problem Formulation 
         Regression Algorithms
    System Overview
    Implementation 
         Data Collection 
         Feature Extraction 
         Subjective Test
         Regressor Training
    Performance Evaluation 
         Consistency Evaluation of the Ground Truth 
         Data Transformation 
         Feature Selection
         Accuracy of Emotion Recognition 
         Performance Evaluation for Music Emotion Variation Detection  
         Performance Evaluation for Emotion Classification

    Ranking-Based Emotion Annotation and Model Training
    Motivation
    Ranking-Based Emotion Annotation
    Computational Model for Ranking Music by Emotion 
         Learning-to-Rank 
         Ranking Algorithms
    System Overview
    Implementation
         Data Collection 
         Feature Extraction
    Performance Evaluation 
         Cognitive Load of Annotation 
         Accuracy of Emotion Recognition
         Subjective Evaluation of the Prediction Result

    Fuzzy Classification of Music Emotion 
    Motivation
    Fuzzy Classification
         Fuzzy k-NN Classifier 
         Fuzzy Nearest-Mean Classifier
    System Overview
    Implementation 
         Data Collection 
         Feature Extraction and Feature Selection
    Performance Evaluation 
         Accuracy of Emotion Classification
         Music Emotion Variation Detection

    Personalized MER and Groupwise MER
    Motivation
    Personalized MER
    Groupwise MER
    Implementation 
         Data Collection 
         Personal Information Collection 
         Feature Extraction
    Performance Evaluation 
         Performance of the General Method
         Performance of GWMER
         Performance of PMER

    Two-Layer Personalization
    Problem Formulation
    Bag-of-Users Model
    Residual Modeling and Two-Layer Personalization Scheme
    Performance Evaluation

    Probabilistic Music Emotion Distribution Prediction
    Motivation
    Problem Formulation
    The KDE-Based Approach to Music Emotion Distribution Prediction 
         Ground Truth Collection 
         Regressor Training
         Regressor Fusion
         Output of Emotion Distribution
    Implementation 
         Data Collection 
         Feature Extraction
    Performance Evaluation 
         Comparison of Different Regression Algorithms
         Comparison of Different Distribution Modeling Methods
         Comparison of Different Feature Representations 
         Evaluation of Regressor Fusion

    Lyrics Analysis and Its Application to MER
    Motivation
    Lyrics Feature Extraction
         Uni-gram 
         Probabilistic Latent Semantic Analysis (PLSA) 
         Bi-gram 
    Multimodal MER System
    Performance Evaluation
         Comparison of Multimodal Fusion Methods
         Evaluation for PLSA Model
         Evaluation for Bi-Gram Model

    Chord Recognition and Its Application to MER
    Chord Recognition 
         Beat Tracking and PCP Extraction 
         Hidden Markov Model and N-Gram Model
         Chord Decoding 
    Chord Features
         Longest Common Chord Subsequence
         Chord Histogram
    System Overview
    Performance Evaluation 
         Evaluation of Chord Recognition System 
         Accuracy of Emotion Classification

    Genre Classification and Its Application to MER
    Motivation
    Two-Layer Music Emotion Classification
    Performance Evaluation 
         Data Collection 
         Analysis of the Correlation between Genre and Emotion
         Evaluation of the Two-Layer Emotion Classification Scheme

    Music Retrieval in the Emotion Plane
    Emotion-Based Music Retrieval
    2D Visualization of Music
    Retrieval Methods 
         Query by Emotion Point (QBEP) 
         Query by Emotion Trajectory (QBET) 
         Query by Artist and Emotion (QBAE) 
         Query by Lyrics and Emotion (QBLE)
    Implementation

    Future Research Directions
    Exploiting Vocal Timbre for MER
    Emotion Distribution Prediction Based on Rankings
    Personalized Emotion-Based Music Retrieval 
    Situational Factors of Emotion Perception 
    Connections between Dimensional and Categorical MER
    Music Retrieval and Organization in 3D Emotion Space

    Biography

    Yi-Hsuan Yang received a Ph.D. in Communication Engineering from National Taiwan University in 2010. His research interests include multimedia information retrieval, music analysis, machine learning, and affective computing. He has published over 30 technical papers in these areas. Dr. Yang was awarded the MediaTek Fellowship in 2009 and the Microsoft Research Asia Fellowship in 2008.

    Homer H. Chen received a Ph.D. in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign. Since August 2003, he has been with the College of Electrical Engineering and Computer Science, National Taiwan University, where he is the Irving T. Ho Chair Professor. Prior to that, he held various R&D management and engineering positions with US companies over a period of 17 years, including AT&T Bell Labs, Rockwell Science Center, iVast, and Digital Island. He was a US delegate for ISO and ITU standards committees and contributed to the development of many new interactive multimedia technologies that are now part of the MPEG-4 and JPEG-2000 standards. His professional interests lie in the broad area of multimedia signal processing and communications.

    Dr. Chen is an Associate Editor of IEEE Transactions on Circuits and Systems for Video Technology. He served as an Associate Editor of IEEE Transactions on Image Processing from 1992 to 1994, a Guest Editor of IEEE Transactions on Circuits and Systems for Video Technology in 1999, and an Associate Editor of Pattern Recognition from 1989 to 1999. He is an IEEE Fellow.