Details

Digital Speech Transmission

Enhancement, Coding and Error Concealment
1st ed.

by: Peter Vary, Rainer Martin

107,99 €

Publisher: Wiley
Format: PDF
Published: 4 August 2006
ISBN/EAN: 9780470031759
Language: English
Number of pages: 648

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Description

The enormous advances in digital signal processing (DSP) technology have contributed to the wide dissemination and success of speech communication devices, be it GSM and UMTS mobile telephones, digital hearing aids, or human-machine interfaces. Digital speech transmission techniques play an important role in these applications, all the more because high-quality speech transmission remains essential in all current and next-generation communication networks.

Enhancement, coding and error concealment techniques improve the transmitted speech signal at all stages of the transmission chain, from the acoustic front-end to the sound reproduction at the receiver. Advanced speech processing algorithms help to mitigate a number of physical and technological limitations such as background noise, bandwidth restrictions, shortage of radio frequencies, and transmission errors.

Digital Speech Transmission provides a single-source, comprehensive guide to the fundamental issues, algorithms, standards, and trends in speech signal processing and speech communication technology. The authors give a solid, accessible overview of:
* fundamentals of speech signal processing
* speech coding, including new speech coders for GSM and UMTS
* error concealment by soft decoding
* artificial bandwidth extension of speech signals
* single and multi-channel noise reduction
* acoustic echo cancellation

This text is an invaluable resource for engineers, researchers, academics, and graduate students in the areas of communications, electrical engineering, and information technology.
<p>Preface xv</p> <p><b>1 Introduction 1</b></p> <p><b>2 Models of Speech Production and Hearing 5</b></p> <p>2.1 Organs of Speech Production 6</p> <p>2.2 Characteristics of Speech Signals 8</p> <p>2.3 Model of Speech Production 10</p> <p>2.3.1 Acoustic Tube Model of the Vocal Tract 11</p> <p>2.3.2 Digital All-Pole Model of the Vocal Tract 19</p> <p>2.4 Anatomy of Hearing 25</p> <p>2.5 Psychoacoustic Properties of the Auditory Organ 28</p> <p>2.5.1 Hearing and Loudness 28</p> <p>2.5.2 Spectral Resolution 30</p> <p>2.5.3 Masking 32</p> <p>Bibliography 33</p> <p><b>3 Spectral Transformations 35</b></p> <p>3.1 Fourier Transform of Continuous Signals 35</p> <p>3.2 Fourier Transform of Discrete Signals 37</p> <p>3.3 Linear Shift Invariant Systems 39</p> <p>3.3.1 Frequency Response of LSI Systems 41</p> <p>3.4 The <i>z</i>-transform 41</p> <p>3.4.1 Relation to FT 43</p> <p>3.4.2 Properties of the ROC 44</p> <p>3.4.3 Inverse <i>z</i>-transform 44</p> <p>3.4.4 <i>z</i>-transform Analysis of LSI Systems 46</p> <p>3.5 The Discrete Fourier Transform 47</p> <p>3.5.1 Linear and Cyclic Convolution 50</p> <p>3.5.2 The DFT of Windowed Sequences 52</p> <p>3.5.3 Spectral Resolution and Zero Padding 55</p> <p>3.5.4 Fast Computation of the DFT: The FFT 56</p> <p>3.5.5 Radix-2 Decimation-in-Time FFT 57</p> <p>3.6 Fast Convolution 61</p> <p>3.6.1 Fast Convolution of Long Sequences 61</p> <p>3.6.2 Fast Convolution by Overlap-Add 61</p> <p>3.6.3 Fast Convolution by Overlap-Save 62</p> <p>3.7 Cepstral Analysis 65</p> <p>3.7.1 Complex Cepstrum 65</p> <p>3.7.2 Real Cepstrum 66</p> <p>3.7.3 Applications of the Cepstrum 67</p> <p>Bibliography 70</p> <p><b>4 Filter Banks for Spectral Analysis and Synthesis 73</b></p> <p>4.1 Spectral Analysis Using Narrowband Filters 73</p> <p>4.1.1 Short-Term Spectral Analyzer 78</p> <p>4.1.2 Prototype Filter Design for the Analysis Filter Bank 82</p> <p>4.1.3 Short-Term Spectral Synthesizer 84</p> <p>4.1.4 Short-Term Spectral Analysis and Synthesis 86</p> 
<p>4.1.5 Prototype Filter Design for the Analysis–Synthesis Filter Bank 88</p> <p>4.1.6 Filter Bank Interpretation of the DFT 90</p> <p>4.2 Polyphase Network Filter Banks 93</p> <p>4.2.1 PPN Analysis Filter Bank 93</p> <p>4.2.2 PPN Synthesis Filter Bank 101</p> <p>4.3 Quadrature Mirror Filter Banks 105</p> <p>4.3.1 Analysis–Synthesis Filter Bank 105</p> <p>4.3.2 Compensation of Aliasing and Signal Reconstruction 107</p> <p>4.3.3 Efficient Implementation 111</p> <p>Bibliography 115</p> <p><b>5 Stochastic Signals and Estimation 119</b></p> <p>5.1 Basic Concepts 119</p> <p>5.1.1 Random Events and Probability 119</p> <p>5.1.2 Conditional Probabilities 121</p> <p>5.1.3 Random Variables 121</p> <p>5.1.4 Probability Distributions and Probability Density Functions 122</p> <p>5.1.5 Conditional PDFs 123</p> <p>5.2 Expectations and Moments 124</p> <p>5.2.1 Conditional Expectations and Moments 125</p> <p>5.2.2 Examples 125</p> <p>5.2.3 Transformation of a Random Variable 128</p> <p>5.2.4 Relative Frequencies and Histograms 129</p> <p>5.3 Bivariate Statistics 130</p> <p>5.3.1 Marginal Densities 130</p> <p>5.3.2 Expectations and Moments 130</p> <p>5.3.3 Uncorrelatedness and Statistical Independence 131</p> <p>5.3.4 Examples of Bivariate PDFs 132</p> <p>5.3.5 Functions of Two Random Variables 133</p> <p>5.4 Probability and Information 135</p> <p>5.4.1 Entropy 135</p> <p>5.4.2 Kullback–Leibler Divergence 135</p> <p>5.4.3 Mutual Information 136</p> <p>5.5 Multivariate Statistics 136</p> <p>5.5.1 Multivariate Gaussian Distribution 137</p> <p>5.5.2 <i>χ</i><sup>2</sup>-distribution 137</p> <p>5.6 Stochastic Processes 138</p> <p>5.6.1 Stationary Processes 138</p> <p>5.6.2 Auto-correlation and Auto-covariance Functions 139</p> <p>5.6.3 Cross-correlation and Cross-covariance Functions 140</p> <p>5.6.4 Multivariate Stochastic Processes 140</p> <p>5.7 Estimation of Statistical Quantities by Time Averages 142</p> <p>5.7.1 Ergodic Processes 142</p> <p>5.7.2 Short-Time Stationary Processes 
143</p> <p>5.8 Power Spectral Densities 144</p> <p>5.8.1 White Noise 145</p> <p>5.9 Estimation of the Power Spectral Density 145</p> <p>5.9.1 The Periodogram 145</p> <p>5.9.2 Smoothed Periodograms 147</p> <p>5.10 Statistical Properties of Speech Signals 147</p> <p>5.11 Statistical Properties of DFT Coefficients 148</p> <p>5.11.1 Asymptotic Statistical Properties 149</p> <p>5.11.2 Signal-plus-Noise Model 150</p> <p>5.11.3 Statistical Properties of DFT Coefficients for Finite Frame Lengths 152</p> <p>5.12 Optimal Estimation 154</p> <p>5.12.1 MMSE Estimation 155</p> <p>5.12.2 Optimal Linear Estimator 156</p> <p>5.12.3 The Gaussian Case 157</p> <p>5.12.4 Joint Detection and Estimation 158</p> <p>Bibliography 160</p> <p><b>6 Linear Prediction 163</b></p> <p>6.1 Vocal Tract Models and Short-Term Prediction 164</p> <p>6.2 Optimal Prediction Coefficients for Stationary Signals 171</p> <p>6.2.1 Optimum Prediction 171</p> <p>6.2.2 Spectral Flatness Measure 174</p> <p>6.3 Predictor Adaptation 177</p> <p>6.3.1 Block-Oriented Adaptation 177</p> <p>6.3.2 Sequential Adaptation 188</p> <p>6.4 Long-Term Prediction 192</p> <p>Bibliography 198</p> <p><b>7 Quantization 201</b></p> <p>7.1 Analog Samples and Digital Representation 201</p> <p>7.2 Uniform Quantization 203</p> <p>7.3 Non-uniform Quantization 211</p> <p>7.4 Optimal Quantization 221</p> <p>7.5 Adaptive Quantization 222</p> <p>7.6 Vector Quantization 228</p> <p>7.6.1 Principle 228</p> <p>7.6.2 The Complexity Problem 230</p> <p>7.6.3 Lattice Quantization 231</p> <p>7.6.4 Design of Optimal Vector Code Books 232</p> <p>7.6.5 Gain–Shape Vector Quantization 236</p> <p>Bibliography 237</p> <p><b>8 Speech Coding 239</b></p> <p>8.1 Classification of Speech Coding Algorithms 240</p> <p>8.2 Model-Based Predictive Coding 243</p> <p>8.3 Differential Waveform Coding 245</p> <p>8.3.1 First-Order DPCM 245</p> <p>8.3.2 Open-Loop and Closed-Loop Prediction 249</p> <p>8.3.3 Quantization of the Residual Signal 250</p> <p>8.3.4 Adaptive 
Differential Pulse Code Modulation 260</p> <p>8.4 Parametric Coding 262</p> <p>8.4.1 Vocoder Structures 262</p> <p>8.4.2 LPC Vocoder 265</p> <p>8.4.3 Quantization of the Predictor Coefficients 266</p> <p>8.5 Hybrid Coding 273</p> <p>8.5.1 Basic Codec Concepts 273</p> <p>8.5.2 Residual Signal Coding: RELP 282</p> <p>8.5.3 Analysis by Synthesis: CELP 290</p> <p>8.5.4 Analysis by Synthesis: MPE, RPE 301</p> <p>8.6 Adaptive Postfiltering 305</p> <p>Bibliography 309</p> <p><b>9 Error Concealment and Soft Decision Source Decoding 315</b></p> <p>9.1 Hard Decision Source Decoding 316</p> <p>9.2 Conventional Error Concealment 317</p> <p>9.3 Softbits and <i>L</i>-values 321</p> <p>9.3.1 Binary Symmetric Channel (BSC) 321</p> <p>9.3.2 Fading–AWGN Channel 329</p> <p>9.3.3 Channel with Inner SISO Decoding 335</p> <p>9.4 Soft Decision (SD) Source Decoding 336</p> <p>9.4.1 Parameter Estimation 338</p> <p>9.4.2 The <i>A Posteriori</i> Probabilities 340</p> <p>9.5 Application to Model Parameters 345</p> <p>9.5.1 Soft Decision Decoding without Channel Coding 346</p> <p>9.5.2 Soft Decision Decoding with Channel Coding 348</p> <p>9.6 Further Improvements 353</p> <p>Bibliography 355</p> <p><b>10 Bandwidth Extension (BWE) of Speech Signals 361</b></p> <p>10.1 Narrowband versus Wideband Telephony 362</p> <p>10.2 Speech Coding with Integrated BWE 366</p> <p>10.3 BWE without Auxiliary Transmission 369</p> <p>10.3.1 Basic Approaches and Classification 369</p> <p>10.3.2 Spectral Envelope Estimation 372</p> <p>10.3.3 Extension of the Excitation Signal 375</p> <p>10.3.4 Example BWE Algorithm 377</p> <p>Bibliography 382</p> <p><b>11 Single and Dual Channel Noise Reduction 389</b></p> <p>11.1 Introduction 390</p> <p>11.2 Linear MMSE Estimators 392</p> <p>11.2.1 Non-causal IIR Wiener filter 392</p> <p>11.2.2 The FIR Wiener Filter 395</p> <p>11.3 Speech Enhancement in the DFT Domain 396</p> <p>11.3.1 The Wiener Filter Revisited 398</p> <p>11.3.2 Spectral Subtraction 400</p> <p>11.3.3 Estimation of 
the <i>A Priori</i> SNR 402</p> <p>11.3.4 Musical Noise and Countermeasures 403</p> <p>11.3.5 Aspects of Spectral Analysis/Synthesis 408</p> <p>11.4 Optimal Non-linear Estimators 411</p> <p>11.4.1 Maximum Likelihood Estimation 412</p> <p>11.4.2 Maximum <i>A Posteriori</i> Estimation 414</p> <p>11.4.3 MMSE Estimation 414</p> <p>11.4.4 MMSE Estimation of Functions of the Spectral Magnitude 416</p> <p>11.5 Joint Optimum Detection and Estimation of Speech 419</p> <p>11.6 Computation of Likelihood Ratios 422</p> <p>11.7 Estimation of the <i>A Priori</i> Probability of Speech Presence 423</p> <p>11.7.1 A Hard-Decision Estimator Based on Conditional Probabilities 423</p> <p>11.7.2 Soft-Decision Estimation 424</p> <p>11.7.3 Estimation Based on the <i>A Posteriori</i> SNR 424</p> <p>11.8 VAD and Noise Estimation Techniques 425</p> <p>11.8.1 Voice Activity Detection 426</p> <p>11.8.2 Noise Estimation Using a Soft-Decision Detector 432</p> <p>11.8.3 Noise Power Estimation Based on Minimum Statistics 434</p> <p>11.9 Dual Channel Systems 443</p> <p>11.9.1 Noise Cancellation 449</p> <p>11.9.2 Noise Reduction 452</p> <p>11.9.3 Implementations of Dual Channel Noise Reduction Systems 453</p> <p>11.9.4 Combined Single and Dual Channel Noise Reduction 454</p> <p>Bibliography 456</p> <p><b>12 Multi-channel Noise Reduction 467</b></p> <p>12.1 Introduction 467</p> <p>12.2 Sound Waves 468</p> <p>12.3 Spatial Sampling of Sound Fields 470</p> <p>12.3.1 The Farfield Model 472</p> <p>12.3.2 The Uniform Linear Array 474</p> <p>12.3.3 Phase Ambiguity and Coherence 475</p> <p>12.3.4 Spatial Correlation Properties of Acoustic Signals 476</p> <p>12.4 Beamforming 477</p> <p>12.4.1 Delay-and-Sum Beamforming 477</p> <p>12.4.2 Filter-and-Sum Beamforming 478</p> <p>12.5 Performance Measures and Spatial Aliasing 481</p> <p>12.5.1 Array Gain and Array Sensitivity 481</p> <p>12.5.2 Directivity Pattern 482</p> <p>12.5.3 Directivity and Directivity Index 484</p> <p>12.5.4 Example: Differential Microphones 
485</p> <p>12.6 Design of Fixed Beamformers 488</p> <p>12.6.1 Minimum Variance Distortionless Response Beamformer 488</p> <p>12.6.2 MVDR Beamformer with Limited Susceptibility 491</p> <p>12.7 Multi-channel Wiener Filter and Postfilter 493</p> <p>12.8 Adaptive Beamformers 495</p> <p>12.8.1 The Frost Beamformer 495</p> <p>12.8.2 Generalized Side-Lobe Canceller 498</p> <p>12.8.3 Generalized Side-lobe Canceller with Adaptive Blocking Matrix 500</p> <p>12.9 Optimal Non-linear Multi-channel Noise Reduction 501</p> <p>Bibliography 501</p> <p><b>13 Acoustic Echo Control 505</b></p> <p>13.1 The Echo Control Problem 505</p> <p>13.2 Evaluation Criteria 511</p> <p>13.3 The Wiener Solution 513</p> <p>13.4 The LMS and NLMS Algorithms 514</p> <p>13.4.1 Derivation and Basic Properties 514</p> <p>13.5 Convergence Analysis and Control of the LMS Algorithm 516</p> <p>13.5.1 Convergence in the Absence of Interference 517</p> <p>13.5.2 Convergence in the Presence of Interference 520</p> <p>13.5.3 Filter Order of the Echo Canceller 523</p> <p>13.5.4 Stepsize Parameter 524</p> <p>13.6 Geometric Projection Interpretation of the NLMS Algorithm 527</p> <p>13.7 The Affine Projection Algorithm 529</p> <p>13.8 Least-Squares and Recursive Least-Squares Algorithms 531</p> <p>13.8.1 The Weighted Least-Squares Algorithm 532</p> <p>13.8.2 The RLS Algorithm 533</p> <p>13.9 Block Processing and Frequency Domain Adaptive Filters 536</p> <p>13.9.1 Block LMS Algorithm 537</p> <p>13.9.2 The Exact Block NLMS Algorithm 537</p> <p>13.9.3 Frequency Domain Adaptive Filter (FDAF) 539</p> <p>13.9.4 Subband Acoustic Echo Cancellation 549</p> <p>13.10 Additional Measures for Echo Control 550</p> <p>13.10.1 Echo Canceller with Center Clipper 550</p> <p>13.10.2 Echo Canceller with Voice-Controlled Switching 551</p> <p>13.10.3 Echo Canceller with Adaptive Postfilter in the Time Domain 553</p> <p>13.10.4 Echo Canceller with Adaptive Postfilter in the Frequency Domain 554</p> <p>13.10.5 Initialization with Perfect 
Sequences 555</p> <p>13.11 Stereophonic Acoustic Echo Control 557</p> <p>13.11.1 The Non-uniqueness Problem 559</p> <p>13.11.2 Solutions to the Non-uniqueness Problem 559</p> <p>Bibliography 561</p> <p><b>Appendix A Codec Standards 569</b></p> <p>A.1 Evaluation Criteria 570</p> <p>A.2 ITU-T/G.726: Adaptive Differential Pulse Code Modulation (ADPCM) 572</p> <p>A.3 ITU-T/G.728: Low-Delay CELP Speech Coder 573</p> <p>A.4 ITU-T/G.729: Conjugate-Structure Algebraic CELP Codec 576</p> <p>A.5 ITU-T/G.722: 7 kHz Audio Coding within 64 kbit/s 579</p> <p>A.6 ETSI-GSM 06.10: Full Rate Speech Transcoding 580</p> <p>A.7 ETSI-GSM 06.20: Half Rate Speech Transcoding 582</p> <p>A.8 ETSI-GSM 06.60: Enhanced Full Rate Speech Transcoding 584</p> <p>A.9 ETSI-GSM 06.90: Adaptive Multi-Rate (AMR) Speech Transcoding 586</p> <p>A.10 ETSI/3GPP AMR Wideband Speech Transcoding 590</p> <p>A.11 ETSI/3GPP Extended AMR Wideband Codec, AMR-WB<sup>+</sup> 592</p> <p>A.12 TIA IS-96: Speech Service Option Standard for Wideband Spread-Spectrum Systems 594</p> <p>A.13 INMARSAT: Improved Multi-Band Excitation Codec (IMBE) 595</p> <p><b>Appendix B Speech Quality Assessment 597</b></p> <p>B.1 Auditive Speech Quality Measures 597</p> <p>B.2 Instrumental Speech Quality Measures 602</p> <p>Bibliography 604</p> <p>Index 607</p>
<p><strong>Peter Vary</strong> is the author of <em>Digital Speech Transmission: Enhancement, Coding and Error Concealment</em>, published by Wiley.</p> <p><strong>Rainer Martin</strong> is the author of <em>Digital Speech Transmission: Enhancement, Coding and Error Concealment</em>, published by Wiley.</p>

You might also be interested in these products:

Bandwidth Efficient Coding
by: John B. Anderson
EPUB ebook
114,99 €
Digital Communications with Emphasis on Data Modems
by: Richard W. Middlestead
PDF ebook
171,99 €
Bandwidth Efficient Coding
by: John B. Anderson
PDF ebook
114,99 €