2nd Edition

Speech Enhancement: Theory and Practice, Second Edition

By Philipos C. Loizou Copyright 2013
    716 Pages 207 B/W Illustrations
    by CRC Press

    With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at improving speech intelligibility.

    Fundamentals, Algorithms, Evaluation, and Future Steps

    Organized into four parts, the book begins with a review of the fundamentals needed to understand and design better speech enhancement algorithms. The second part describes all the major enhancement algorithms and, because these require an estimate of the noise spectrum, also covers noise estimation algorithms. The third part of the book looks at the measures used to assess the performance, in terms of speech quality and intelligibility, of speech enhancement methods. It also evaluates and compares several of the algorithms. The fourth part presents binary mask algorithms for improving speech intelligibility under ideal conditions. In addition, it suggests steps that can be taken to realize the full potential of these algorithms under realistic conditions.

    What’s New in This Edition

    • Updates in every chapter
    • A new chapter on objective speech intelligibility measures
    • A new chapter on algorithms for improving speech intelligibility
    • Real-world noise recordings (on downloadable resources)
    • MATLAB® code for the implementation of intelligibility measures (on downloadable resources)
    • MATLAB and C/C++ code for the implementation of algorithms to improve speech intelligibility (on downloadable resources)

    Valuable Insights from a Pioneer in Speech Enhancement

    Clear and concise, this book explores how human listeners compensate for acoustic noise in noisy environments. Written by a pioneer in speech enhancement and noise reduction in cochlear implants, it is an essential resource for anyone who wants to implement or incorporate the latest speech enhancement algorithms to improve the quality and intelligibility of speech degraded by noise.

    Includes Downloadable Resources with Code and Recordings

    The downloadable resources provide MATLAB implementations of representative speech enhancement algorithms as well as speech and noise databases for the evaluation of enhancement algorithms.

    Introduction
    Understanding the Enemy: Noise
    Classes of Speech Enhancement Algorithms
    Book Organization
    References

    Part I Fundamentals

    Discrete-Time Signal Processing and Short-Time Fourier Analysis
    Discrete-Time Signals
    Linear Time-Invariant Discrete-Time Systems
    z-Transform
    Discrete-Time Fourier Transform
    Short-Time Fourier Transform
    Spectrographic Analysis of Speech Signals
    Summary
    References

    Speech Production and Perception
    Speech Signal
    Speech Production Process
    Engineering Model of Speech Production
    Classes of Speech Sounds
    Acoustic Cues in Speech Perception
    Summary
    References

    Noise Compensation by Human Listeners
    Intelligibility of Speech in Multiple-Talker Conditions
    Acoustic Properties of Speech Contributing to Robustness
    Perceptual Strategies for Listening in Noise
    Summary
    References

    Part II Algorithms

    Spectral-Subtractive Algorithms
    Basic Principles of Spectral Subtraction
    Geometric View of Spectral Subtraction
    Shortcomings of the Spectral Subtraction Method
    Spectral Subtraction Using Oversubtraction
    Nonlinear Spectral Subtraction
    Multiband Spectral Subtraction
    MMSE Spectral Subtraction Algorithm
    Extended Spectral Subtraction
    Spectral Subtraction Using Adaptive Gain Averaging
    Selective Spectral Subtraction
    Spectral Subtraction Based on Perceptual Properties
    Performance of Spectral Subtraction Algorithms
    Summary
    References

    Wiener Filtering
    Introduction to Wiener Filter Theory
    Wiener Filters in the Time Domain
    Wiener Filters in the Frequency Domain
    Wiener Filters and Linear Prediction
    Wiener Filters for Noise Reduction
    Iterative Wiener Filtering
    Imposing Constraints on Iterative Wiener Filtering
    Constrained Iterative Wiener Filtering
    Constrained Wiener Filtering
    Estimating the Wiener Gain Function
    Incorporating Psychoacoustic Constraints in Wiener Filtering
    Codebook-Driven Wiener Filtering
    Audible Noise Suppression Algorithm
    Summary
    References

    Statistical-Model-Based Methods
    Maximum-Likelihood Estimators
    Bayesian Estimators
    MMSE Estimator
    Improvements to the Decision-Directed Approach
    Implementation and Evaluation of the MMSE Estimator
    Elimination of Musical Noise
    Log-MMSE Estimator
    MMSE Estimation of the pth-Power Spectrum
    MMSE Estimators Based on Non-Gaussian Distributions
    Maximum A Posteriori (MAP) Estimators
    General Bayesian Estimators
    Perceptually Motivated Bayesian Estimators
    Incorporating Speech Absence Probability in Speech Enhancement
    Methods for Estimating the A Priori Probability of Speech Absence
    Summary
    References

    Subspace Algorithms
    Introduction
    Using SVD for Noise Reduction: Theory
    SVD-Based Algorithms: White Noise
    SVD-Based Algorithms: Colored Noise
    SVD-Based Methods: A Unified View
    EVD-Based Methods: White Noise
    EVD-Based Methods: Colored Noise
    EVD-Based Methods: A Unified View
    Perceptually Motivated Subspace Algorithms
    Subspace-Tracking Algorithms
    Summary
    References

    Noise-Estimation Algorithms
    Voice Activity Detection vs. Noise Estimation
    Introduction to Noise-Estimation Algorithms
    Minimal-Tracking Algorithms
    Time-Recursive Averaging Algorithms for Noise Estimation
    Histogram-Based Techniques
    Other Noise-Estimation Algorithms
    Objective Comparison of Noise-Estimation Algorithms
    Summary
    References

    Part III Evaluation

    Evaluating Performance of Speech Enhancement Algorithms
    Quality vs. Intelligibility
    Evaluating Intelligibility of Processed Speech
    Evaluating Quality of Processed Speech
    Evaluating Reliability of Quality Judgments: Recommended Practice
    Summary
    References

    Objective Quality and Intelligibility Measures
    Objective Quality Measures
    Evaluation of Objective Quality Measures
    Quality Measures: Summary of Findings and Future Directions
    Speech Intelligibility Measures
    Evaluation of Intelligibility Measures
    Intelligibility Measures: Summary of Findings and Future Directions
    Summary
    References

    Comparison of Speech Enhancement Algorithms
    NOIZEUS: A Noisy Speech Corpus for Quality Evaluation of Speech Enhancement Algorithms
    Comparison of Enhancement Algorithms: Speech Quality
    Comparison of Enhancement Algorithms: Speech Intelligibility
    Summary
    References

    Part IV Future Steps

    Algorithms That Can Improve Speech Intelligibility
    Reasons for the Absence of Intelligibility Improvement with Existing Noise-Reduction Algorithms
    Algorithms Based on Channel Selection: A Different Paradigm for Noise Reduction
    Channel-Selection Criteria
    Intelligibility Evaluation of Channel-Selection-Based Algorithms: Ideal Conditions
    Implementation of Channel-Selection-Based Algorithms in Realistic Conditions
    Evaluating Binary Mask Estimation Algorithms
    Channel Selection and Auditory Scene Analysis
    Summary
    References

    Appendices
    Appendix A: Special Functions and Integrals
    Appendix B: Derivation of the MMSE Estimator
    Appendix C: MATLAB® Code and Speech/Noise Databases

    Index

    Biography

    Philipos C. Loizou earned his bachelor’s, master’s, and doctorate degrees in electrical engineering from Arizona State University in Tempe. A pioneer in the field of speech enhancement and noise reduction in cochlear implants, Dr. Loizou was one of the first to develop specific enhancement algorithms that directly improve intelligibility. He was a postdoctoral fellow in the Department of Speech and Hearing Science at Arizona State University, an assistant professor at the University of Arkansas in Little Rock, and Cecil and Ida Green Professor in the Department of Electrical Engineering at the University of Texas at Dallas. Dr. Loizou was a fellow of the Acoustical Society of America. He was an associate editor of the International Journal of Audiology (2010–2012), IEEE Transactions on Biomedical Engineering (2009–2011), IEEE Transactions on Speech and Audio Processing (1999–2002), and IEEE Signal Processing Letters (2006–2009) and a member of the Speech Technical Committee (2008–2010) of the IEEE Signal Processing Society. He authored or coauthored numerous publications, including three textbooks.

    For more information, see Dr. Loizou’s profile at the University of Texas at Dallas.

    "… indispensable for anyone trying to further his or her understanding in the field of digital speech processing. This book is critical in helping the professional address the growing demand to design algorithms that can improve speech intelligibility, in the presence of noise, without sacrificing quality for hearing aids and cochlear implants and address the equally important growing need to design rooms in which we can hear better naturally. Loizou's clarity of presentation of the mathematical foundation of different algorithms for speech enhancement has a comprehensibility that can only come with the level of expertise possessed by Loizou and serves well for all professionals from acoustical engineers to audiologists. This book provides an exceptional foundation and insight into past, present and future innovative processing techniques. This book is valuable for students and professionals of all experience levels."
    —Bonnie Schnitta, SoundSense, LLC, Wainscott, New York, USA, from Noise Control Engineering Journal, January-February 2015

    "... by far the most comprehensive treatment of speech enhancement available. All the most important techniques in the broad field of speech enhancement are covered, yet the author at the same time manages to treat each topic in great detail. ... The algorithms are complex, but Loizou's exposition is outstanding. The second edition brings the material right up to date, covering recent significant breakthroughs in binary masking algorithms ... . One of the great strengths of this text is the availability of code, allowing readers to better understand, deploy, and extend existing algorithms for speech enhancement. ... This volume is in reality far more than a textbook on speech enhancement. It is also one of the most important works on the effect of noise on speech perception, and as such will make a huge contribution to the education of the next generation of auditory scientists, and feed technological developments in all aspects of speech communication, particularly for individuals with hearing impairment."
    —Prof. Martin Cooke, Ikerbasque and University of the Basque Country, Vitoria, Spain

    "This textbook offers outstanding reference material for teaching the clinical application of spectral enhancement to the audiology community. Dr. Loizou offers the reader tremendous insight into the fundamentals of digital signal processing, speech production and perception, and the characteristics of various noise sources. ... The textbook is essential for engineers, audiologists, and other professionals who seek to improve the listener’s ability to hear a target signal against a background filled with competing noise using the spectral enhancement technique."
    —Amyn M. Amlani, Ph.D., University of North Texas, Denton, USA

    "... a highly informative presentation of the fundamentals, seminal and current algorithms, evaluation metrics, and future work that is desirable for any new or experienced students and researchers in the exciting area of speech enhancement. ... I greatly appreciate the excellent organization of dividing the book into Fundamentals, Algorithms, Evaluation, and Future Steps, which can allow instructors and researchers to quickly decide on the material they want to teach their students, or learn or review themselves ... Dr. Loizou takes students and researchers with a range of experiences on an amazing journey through the exciting field of speech enhancement."
    —Marek Trawicki, Marquette University, Wauwatosa, Wisconsin, USA

    "The first edition of this book established itself as the best reference for single-channel speech enhancement. Amazingly, this new edition is even better, and could be the most authoritative work in the area of modern single-channel techniques for speech enhancement to date. … This is a unique book, combining both thorough theoretical developments and practical implementations. I highly recommend it to those interested in speech enhancement, as well as applied signal processing."
    Association for Computing Machinery (ACM) Computing Reviews, July 2013
    Reviewer: Vladimir Botchev, Analog Devices, Wilmington, Massachusetts, USA