It is becoming increasingly apparent that all forms of communication—including voice—will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding.
Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networks.
Offering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the underlying signal processing techniques used in speech coding. The authors present coding standards from various organizations, including the International Telecommunication Union (ITU). With a focus on applications such as Voice-over-IP telephony, this comprehensive text covers recent research findings on topics including:
- A general introduction to speech processing
- Digital signal processing concepts
- Sampling theory and related topics
- Principles of pulse code modulation (PCM) and adaptive differential pulse code modulation (ADPCM) standards
- Linear prediction (LP) and use of the linear predictive coding (LPC) model
- Vector quantization and its applications in speech coding
- Case studies of practical speech coders from ITU and others
- The Internet low-bit-rate coder (iLBC)
Developed from the authors’ combined teachings, this book also illustrates its contents by providing a real-time implementation of a speech coder on a digital signal processing chip. With its balance of theory and practical coverage, it is ideal for senior-level undergraduate and graduate students in electrical and computer engineering. It is also suitable for engineers and researchers designing or using speech coding systems in their work.
Introduction to Speech Coding
Speech Signals
Characteristics of Speech Signals
Modeling of Speech
Speech Analysis
Speech Coding
Varieties of Speech Coders
Measuring Speech Quality
Communication Networks and Speech Coding
Performance Issues in Speech Communication Systems
Summary of Speech Coding Standards
Fundamentals of DSP for Speech Processing
Introduction to LTI Systems
Review of Digital Signal Processing
Review of Stochastic Signal Processing
Response of a Linear System to a Stochastic Process Input
Windowing
AR Models for Speech Signals, Yule–Walker Equations
Short-Term Frequency (or Fourier) Transform and Cepstrum Periodograms
Spectral Envelope Determination for Speech Signals
Voiced/Unvoiced Classification of Speech Signals
Pitch Period Estimation Methods
Sampling Theory
Nyquist Sampling Theorem
Reconstruction of the Original Signal: Interpolation Filters
Practical Reconstruction
Aliasing and In-Band Distortion
Effect of Sampling Clock Jitter
Sampling and Reconstruction of Random Signals
Waveform Coding and Quantization
Quantization
Quantizer Performance Evaluation
Quantizer Transfer Function
Quantizer Performance under No-Overload Conditions
Uniform Quantizer
Nonuniform Quantizer
Logarithmic Companding
Segmented Companding Laws
ITU G.711 μ-Law and A-Law PCM Standards
Optimum Quantization
Adaptive Quantization
Differential Coding
Closed-Loop Differential Quantizer
Generalization to Predictive Coding
ITU G.726 ADPCM Algorithm
Linear Deltamodulation
Adaptive Deltamodulation
Linear Prediction
Properties of the Autocorrelation Matrix, R
Relation between Linear Prediction and AR Modeling
Augmented Wiener–Hopf Equations for Forward Prediction
Backward Prediction-Error Filter
Augmented Wiener–Hopf Equations for Backward Prediction
LD Recursion
Linear Predictive Coding
Linear Predictive Coding
LPC-10 Federal Standard
Introduction to CELP-Based Coders
Vector Quantization for Speech Coding Applications
Review of Scalar Quantization
Vector Quantization
Lloyd’s Algorithm for Vector Quantizer Design
The Linde–Buzo–Gray Algorithm
Popular Search Algorithms for VQ Quantizer Design
Other Suboptimal Algorithms for VQ Quantizer Design
Applications in Standards
Analysis-by-Synthesis Coding of Speech
CELP AbS Structure
Case Study Example: FS 1016 CELP Coder
Case Study Example: ITU-T G.729/729A Speech Coder
Internet Low-Bit-Rate Coder
Internet Low-Bit-Rate Codec
iLBC’s Encoding Process
iLBC’s Decoding Process
iLBC’s PLC Techniques
iLBC’s Enhancement Techniques
iLBC’s Synthesis and Postfiltering
MATLAB’s Signal Processing Blockset iLBC Demo Model
PESQ
Evolution from PSQM/PSQM+ to PESQ
PESQ Algorithm
PESQ Applications
Signal Processing in VoIP Systems
PSTN and VoIP Networks
Effect of Delay on the Perceived Speech Quality
Line ECANs
Acoustic ECANs
Jitter Buffers
Clock Skew
Packet Loss Recovery Methods
Real-Time DSP Implementation of ITU-T G.729/A Speech Coder
ITU-T G.729/A Speech Coding Standard
TI TMS320C6X DSP Processors
TI’s RF and DSP Algorithm Standard
G.729/A on RF3 on the TI C6X DSP
Running the RF3 Example on EVM
RF3 Resource Requirements
Details of Our Implementation
Migrating ITU-T G.729/A to RF3 and the EVM
Optimizing G.729/A for Real-Time Execution on the EVM
Real-Time Performance for Two Channels
Checking the Test Vectors on the EVM
Going Beyond a Two-Channel Implementation
Conclusions and Future Directions for Speech Coding
Summary
Future Directions for Speech Research
References
Index
Biography
Tokunbo Ogunfunmi is a professor in the Department of Electrical Engineering and Director of the Signal Processing Research Lab (SPRL) at Santa Clara University, California. His research interests include digital adaptive/nonlinear signal processing, speech and video signal processing, artificial neural networks, and VLSI design. He has published two books and over 100 refereed journal and conference papers in these and related application areas. Dr. Ogunfunmi has been a consultant to industry and government and a visiting professor at Stanford University and The University of Texas. He is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE), a Member of Sigma Xi (the Scientific Research Society), and a Member of the American Association for the Advancement of Science (AAAS). He serves as the Chair of the IEEE Signal Processing Society (SPS) Santa Clara Valley Chapter and as a member of several IEEE Technical Committees (TC). He is also a registered professional engineer.
Madihally (Sim) Narasimha is currently a Senior Director of Technology at Qualcomm Inc. Before joining Qualcomm, he was Vice President of Technology at Ample Communications, where he directed the development of Ethernet physical layer chips. Earlier, he served in technology leadership roles at several Voice-over-IP (VoIP) startup companies, including IP Unity, Realchip Communications, and Empowertel Networks. He also held senior management positions at Symmetricom and Granger Associates (a subsidiary of DSC Communications Corporation), where he was instrumental in bringing many DSP-based telecommunications products to market. Dr. Narasimha is also a Consulting Professor in the Department of Electrical Engineering at Stanford University, Stanford, CA, where he teaches telecommunications courses and performs research in related areas. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE).