DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement: A Survey of the State of the Art [Paperback]

  • Format: Paperback / softback, 80 pages, height x width: 235x191 mm, weight: 171 g
  • Series: Synthesis Lectures on Speech and Audio Processing
  • Publication date: 01-Jan-2013
  • Publisher: Morgan and Claypool Life Sciences
  • ISBN-10: 1627051430
  • ISBN-13: 9781627051439
As speech processing devices like mobile phones, voice controlled devices, and hearing aids have increased in popularity, people expect them to work anywhere and at any time without user intervention. However, the presence of acoustical disturbances limits the use of these applications, degrades their performance, or causes the user difficulties in understanding the conversation or appreciating the device. A common way to reduce the effects of such disturbances is through the use of single-microphone noise reduction algorithms for speech enhancement.

The field of single-microphone noise reduction for speech enhancement has a history of more than 30 years of research. In this survey, we aim to demonstrate the significant advances made during the last decade in discrete Fourier transform (DFT) domain-based single-channel noise reduction for speech enhancement.

Furthermore, our goal is to provide a concise description of a state-of-the-art speech enhancement system, and demonstrate the relative importance of the various building blocks of such a system. This allows the non-expert DSP practitioner to judge the relevance of each building block and to implement a close-to-optimal enhancement system for the particular application at hand.
Acknowledgments
Glossary
1 Introduction
2 Single Channel Speech Enhancement: General Principles
2.1 Analysis-Modification-Synthesis (AMS) System
2.2 Finding the Target Estimate
2.3 A priori Knowledge and Assumptions
2.3.1 Taking Speech Signal Characteristics into Account
2.3.2 Taking Noise Process Characteristics into Account
2.3.3 Taking the Human Auditory System into Account
3 DFT-Based Speech Enhancement Methods: Signal Model and Notation
4 Speech DFT Estimators
4.1 Statistical Modeling Assumptions
4.2 Spectral Subtraction
4.3 Linear MMSE Estimators
4.4 Non-linear MMSE Estimators
5 Speech Presence Probability Estimation
5.1 A posteriori Speech Presence Probability
5.2 Estimation of the Model Parameter ξ_H1
5.2.1 Short-term Adaptive Estimate
5.2.2 Fixed Optimal ξ_H1
5.3 Choosing the Prior Probabilities
5.3.1 Adaptive Prior Probabilities
5.3.2 Fixed Prior Probabilities
5.4 Avoiding Outliers
6 Noise PSD Estimation
6.1 Methods Based on VAD
6.2 Methods Based on Minimum Power Level Tracking
6.3 SPP-Based Noise PSD Estimation
6.4 MMSE-Based Estimation of the Noise PSD
6.5 DFT-subspace Estimation of the Noise PSD
7 Speech PSD Estimation
7.1 Maximum Likelihood Estimation and Decision-directed Approach
7.2 Kalman-type Filtering, GARCH Modeling, and Noncausal Estimation
7.3 Temporal Cepstrum Smoothing
7.4 Comparison of the Estimators
8 Performance Evaluation Methods
8.1 Evaluating Quality Aspects of Enhanced Speech
8.1.1 Listening Tests
8.1.2 Instrumental Test Methods
8.2 Evaluating Intelligibility of Enhanced Speech
8.2.1 Listening Tests
8.2.2 Instrumental Test Methods
9 Simulation Experiments with Single-Channel Enhancement Systems
10 Future Directions
References
Authors' Biographies