1st Edition

Principles of Speech Coding

    381 Pages 186 B/W Illustrations
    by CRC Press

    It is becoming increasingly apparent that all forms of communication—including voice—will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding.

    Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networks

    Offering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the underlying signal processing techniques used in speech coding. The authors present coding standards from various organizations, including the International Telecommunication Union (ITU). With a focus on applications such as Voice-over-IP telephony, this comprehensive text covers recent research findings on topics including:

    • A general introduction to speech processing
    • Digital signal processing concepts
    • Sampling theory and related topics
    • Principles of pulse code modulation (PCM) and adaptive differential pulse code modulation (ADPCM) standards
    • Linear prediction (LP) and use of the linear predictive coding (LPC) model
    • Vector quantization and its applications in speech coding
    • Case studies of practical speech coders from ITU and others
    • The Internet low-bit-rate coder (iLBC)

    Developed from the authors’ combined teachings, this book also illustrates its contents by providing a real-time implementation of a speech coder on a digital signal processing chip. With its balance of theory and practical coverage, it is ideal for senior-level undergraduate and graduate students in electrical and computer engineering. It is also suitable for engineers and researchers designing or using speech coding systems in their work.

    Introduction to Speech Coding
    Speech Signals
    Characteristics of Speech Signals
    Modeling of Speech
    Speech Analysis
    Speech Coding
    Varieties of Speech Coders
    Measuring Speech Quality
    Communication Networks and Speech Coding
    Performance Issues in Speech Communication Systems
    Summary of Speech Coding Standards

    Fundamentals of DSP for Speech Processing
    Introduction to LTI Systems
    Review of Digital Signal Processing
    Review of Stochastic Signal Processing
    Response of a Linear System to a Stochastic Process Input
    Windowing
    AR Models for Speech Signals, Yule–Walker Equations
    Short-Term Frequency (or Fourier) Transform and Cepstrum
    Periodograms
    Spectral Envelope Determination for Speech Signals
    Voiced/Unvoiced Classification of Speech Signals
    Pitch Period Estimation Methods

    Sampling Theory
    Nyquist Sampling Theorem
    Reconstruction of the Original Signal: Interpolation Filters
    Practical Reconstruction
    Aliasing and In-Band Distortion
    Effect of Sampling Clock Jitter
    Sampling and Reconstruction of Random Signals

    Waveform Coding and Quantization
    Quantization
    Quantizer Performance Evaluation
    Quantizer Transfer Function
    Quantizer Performance under No-Overload Conditions
    Uniform Quantizer
    Nonuniform Quantizer
    Logarithmic Companding
    Segmented Companding Laws
    ITU G.711 μ-Law and A-Law PCM Standards
    Optimum Quantization
    Adaptive Quantization

    Differential Coding
    Closed-Loop Differential Quantizer
    Generalization to Predictive Coding
    ITU G.726 ADPCM Algorithm
    Linear Deltamodulation
    Adaptive Deltamodulation

    Linear Prediction
    Properties of the Autocorrelation Matrix, R
    Relation between Linear Prediction and AR Modeling
    Augmented Wiener–Hopf Equations for Forward Prediction
    Backward Prediction-Error Filter
    Augmented Wiener–Hopf Equations for Backward Prediction
    LD Recursion

    Linear Predictive Coding
    Linear Predictive Coding
    LPC-10 Federal Standard
    Introduction to CELP-Based Coders

    Vector Quantization for Speech Coding Applications
    Review of Scalar Quantization
    Vector Quantization
    Lloyd’s Algorithm for Vector Quantizer Design
    The Linde–Buzo–Gray Algorithm
    Popular Search Algorithms for VQ Quantizer Design
    Other Suboptimal Algorithms for VQ Quantizer Design
    Applications in Standards

    Analysis-by-Synthesis Coding of Speech
    CELP AbS Structure
    Case Study Example: FS 1016 CELP Coder
    Case Study Example: ITU-T G.729/729A Speech Coder

    Internet Low-Bit-Rate Coder
    Internet Low-Bit-Rate Codec
    iLBC’s Encoding Process
    iLBC’s Decoding Process
    iLBC’s PLC Techniques
    iLBC’s Enhancement Techniques
    iLBC’s Synthesis and Postfiltering
    MATLAB’s Signal Processing Blockset iLBC Demo Model
    PESQ
    Evolution from PSQM/PSQM+ to PESQ
    PESQ Algorithm
    PESQ Applications

    Signal Processing in VoIP Systems
    PSTN and VoIP Networks
    Effect of Delay on the Perceived Speech Quality
    Line ECANs
    Acoustic ECANs
    Jitter Buffers
    Clock Skew
    Packet Loss Recovery Methods

    Real-Time DSP Implementation of ITU-T G.729/A Speech Coder
    ITU-T G.729/A Speech Coding Standard
    TI TMS320C6X DSP Processors
    TI’s RF and DSP Algorithm Standard
    G.729/A on RF3 on the TI C6X DSP
    Running the RF3 Example on EVM
    RF3 Resource Requirements
    Details of Our Implementation
    Migrating ITU-T G.729/A to RF3 and the EVM
    Optimizing G.729/A for Real-Time Execution on the EVM
    Real-Time Performance for Two Channels
    Checking the Test Vectors on the EVM
    Going Beyond a Two-Channel Implementation

    Conclusions and Future Directions for Speech Coding
    Summary
    Future Directions for Speech Research

    References

    Index

    Biography

    Tokunbo Ogunfunmi is a professor in the Department of Electrical Engineering and Director of the Signal Processing Research Lab (SPRL) at Santa Clara University, California. His research interests include digital adaptive/nonlinear signal processing, speech and video signal processing, artificial neural networks, and VLSI design. He has published two books and over 100 refereed journal and conference papers in these and related application areas. Dr. Ogunfunmi has been a consultant to industry and government and a visiting professor at Stanford University and The University of Texas. He is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE), a Member of Sigma Xi (the Scientific Research Society), and a Member of the American Association for the Advancement of Science (AAAS). He serves as the Chair of the IEEE Signal Processing Society (SPS) Santa Clara Valley Chapter and as a member of several IEEE Technical Committees (TC). He is also a registered professional engineer.

    Madihally (Sim) Narasimha is currently a Senior Director of Technology at Qualcomm Inc. Prior to joining Qualcomm, he was Vice President of Technology at Ample Communications, where he directed the development of Ethernet physical layer chips. Before that, he served in technology leadership roles at several Voice-over-IP (VoIP) startup companies, including IP Unity, Realchip Communications, and Empowertel Networks. He also held senior management positions at Symmetricom and Granger Associates (a subsidiary of DSC Communications Corporation), where he was instrumental in bringing many DSP-based telecommunications products to market. Dr. Narasimha is also a Consulting Professor in the Department of Electrical Engineering at Stanford University, Stanford, CA, where he teaches telecommunications courses and performs research in related areas. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE).