
Robustness in Automatic Speech Recognition: Fundamentals and Applications 1996 ed. [Hardback]

Jean-Claude Junqua, Jean-Paul Haton

Format: Hardback, 440 pages, height x width: 234x156 mm, weight: 1860 g, XXX, 440 p., 1 Hardback
Series: The Springer International Series in Engineering and Computer Science 341
Publication date: 31-Oct-1995
Publisher: Springer
ISBN-10: 0792396464
ISBN-13: 9780792396468


Hardback
Price: 187,67 €*
* the price is final, i.e. no further discounts apply
List price: 220,79 €
You save 15%
Delivery from the publisher takes approximately 2-4 weeks


Permalink: https://www.kriso.ee/db/9780792396468.html


Provides a unified view of how to make machine speech recognition reliable in the changing and often unpredictable conditions of real use; so far it works very well in the laboratory but not as consistently elsewhere as consumers demand. Covers the problems of speech production and perception in noise, popular techniques used in speech analysis and automatic speech recognition, problems relevant to robustness and speech-based applications, variability between and within speakers, various types of distorted speech, and recent advances in dealing with such problems. Not intended as a course text, but suitable for a graduate or undergraduate course. Annotation copyright Book News, Inc. Portland, Or.

The domain of speech processing has come to the point where researchers and engineers are concerned with how speech technology can be applied to new products, and how this technology will transform our future. One important problem is to improve robustness of speech processing under adverse conditions, which is the subject of this book. Robust speech processing is a relatively new area which became a concern as technology started moving from laboratory to field applications. A method or an algorithm is robust if it can deal with a broad range of applications and adapt to unknown conditions. Robustness in Automatic Speech Recognition addresses all of the fundamental problems and issues in the area.

The book is divided into three parts. The first provides the background necessary for understanding the rest of the material. It also emphasizes the problems of speech production and perception in noise along with popular techniques used in speech analysis and automatic speech recognition. Part Two discusses the problems relevant to robustness in automatic speech recognition and speech-based applications. It emphasizes intra- and inter-speaker variability as well as automatic speech recognition of Lombard, noisy and channel-distorted speech. Finally, the third part covers recent advances in the field of robust automatic speech recognition.

Audience: An invaluable reference. May be used as a text for advanced courses on the subject.

About the authors

Foreword

Preface

Acknowledgments

Part A SPEECH COMMUNICATION BY HUMANS AND MACHINES

Chapter 1 NATURE AND PERCEPTION OF SPEECH SOUNDS

1.1 SPEECH PRODUCTION

1.1.1 The speech apparatus

1.1.2 Articulatory phonetics

1.1.3 Articulatory models

1.1.4 Production of speech in noise

1.2 ACOUSTIC PHONETICS

1.2.1 Representations of Speech

1.2.2 Phonemes and allophones

1.2.3 Vowels

1.2.4 Consonants

1.2.5 Acoustic-phonetic changes due to the Lombard reflex

1.3 HEARING AND PERCEPTION

1.3.1 The auditory system

1.3.2 Perception of sounds

1.3.3 Influence of the Lombard reflex on speech perception

Chapter 2 BACKGROUND ON SPEECH ANALYSIS

2.1 PRINCIPLES AND AIMS OF SPEECH ANALYSIS METHODS

2.1.1 Introduction

2.1.2 The Fourier transforms

2.1.3 Digital filter-banks

2.2 SPEECH ANALYSIS BASED ON A PRODUCTION MODEL

2.2.1 Introduction to the linear prediction analysis

2.2.2 The LPC Model

2.2.3 Spectral modeling using LPC

2.3 FEATURE ANALYSIS

2.3.1 Introduction

2.3.2 Typical LPC parameters used in recognition

2.3.3 Vector quantization

2.4 TIME-FREQUENCY REPRESENTATIONS OF SPEECH

2.5 WAVELETS

2.6 HIGHER-ORDER SPECTRAL ANALYSIS

2.7 SPEECH ANALYSIS BASED ON AUDITORY MODELS

2.7.1 Introduction

2.7.2 Physiological and psychoacoustic models

2.7.3 Application to ASR

2.8 LIMITS OF STANDARD ANALYSES IN PRESENCE OF NOISE

Chapter 3 FUNDAMENTALS OF AUTOMATIC SPEECH RECOGNITION

3.1 PRELIMINARIES

3.1.1 Basic principles

3.1.2 Historical background

3.2 DISTANCE MEASURES

3.2.1 Introduction

3.2.2 Spectral distance measures

3.2.3 Distance measures and speech perception

3.3 PATTERN RECOGNITION METHODS FOR ASR

3.3.1 Basic principles

3.3.2 Time normalization

3.3.3 Stochastic modeling

3.3.4 Neural networks

3.4 SPEAKER-DEPENDENT AND SPEAKER-INDEPENDENT RECOGNITION

3.4.1 Introduction

3.4.2 Template selection in pattern recognition ASR systems

3.5 PERFORMING FINE DISTINCTIONS IN ASR

Part B ROBUSTNESS IN ASR: PROBLEMS AND ISSUES

Chapter 4 SPEAKER VARIABILITY AND SPECIFICITY

4.1 VARIANTS OF SPEECH AND SPEAKING STYLES

4.1.1 Introduction

4.1.2 Read versus spontaneous speech

4.1.3 Stress and emotion in speech

4.1.4 Male-female differences

4.1.5 Voice conversion

4.1.6 Available databases to study speaking styles

4.2 VARIABILITY AND INVARIANCE

4.2.1 Preliminaries

4.2.2 Personal variation or intra-speaker variability

4.2.3 Inter-speaker variability

4.2.4 Environment variability

4.2.5 Linguistic variability

4.2.6 Contextual variation

4.2.7 Robust phonetic features in the presence of noise

4.2.8 Relational invariance

Chapter 5 DEALING WITH NOISY SPEECH AND CHANNEL DISTORTIONS

5.1 TYPICAL NOISE SOURCES AND CHANNEL DISTORTIONS

5.1.1 Preliminaries

5.1.2 Signal-to-noise ratio evaluation

5.1.3 General assumptions

5.1.4 Characteristics of some common noises

5.2 EFFECTS OF ADDITIVE NOISE ON SPEECH

5.3 HUMAN PERFORMANCE FOR SPEECH IN NOISE

5.4 SOME ISSUES IN ASR OF NOISY SPEECH

5.4.1 Introduction and specific difficulties

5.4.2 Endpoint detection

5.5 THE LOMBARD REFLEX AND ITS INCIDENCE ON ASR SYSTEMS

5.5.1 Preliminaries

5.5.2 ASR of Lombard speech

Part C POSSIBLE SOLUTIONS AND SOME PERSPECTIVES

Chapter 6 THE CURRENT TECHNOLOGY AND ITS LIMITS: AN OVERVIEW

6.1 INTRODUCTION

6.2 WHERE WE ARE TODAY AND WHERE TECHNOLOGY IS HEADING

6.2.1 Current technology

6.2.2 Real challenges

6.2.3 Some reasons for today's limitations

6.3 SPEECH RECOGNITION BY HUMAN LISTENERS AND MACHINES

6.4 OVERVIEW OF RECENT ADVANCES IN ROBUST SPEECH PROCESSING

Chapter 7 TOWARDS ROBUST SPEECH ANALYSIS

7.1 PRELIMINARIES

7.2 SIGNAL ACQUISITION

7.3 ROBUST SPEECH ANALYSIS

7.3.1 On the use of auditory models for better speech analysis

7.3.2 Robust spectral estimation and ARMA models

Chapter 8 ON THE USE OF A ROBUST SPEECH REPRESENTATION

8.1 INTRODUCTION

8.2 FEATURE EXTRACTION

8.2.1 Time derivatives of speech

8.2.2 AR modeling in the autocorrelation domain

8.2.3 Feature processing

8.2.4 Feature transformation

8.2.5 Feature estimation in noise

8.2.6 Other techniques providing improved features

8.3 NOISE-ROBUST DISTORTION AND SIMILARITY MEASURES

8.3.1 Cepstral lifters

8.3.2 Robust distortion measures

8.3.3 Discriminative similarity measures

Chapter 9 ASR OF NOISY, STRESSED, AND CHANNEL DISTORTED SPEECH

9.1 INTRODUCTION

9.2 SPEECH ENHANCEMENT

9.2.1 Filtering techniques

9.2.2 Signal estimation techniques based on statistical modeling for speech enhancement

9.2.3 Linear and non-linear spectral subtraction

9.2.4 Signal restoration via a mapping transformation

9.3 MODEL COMPENSATION

9.3.1 HMM composition and decomposition

9.3.2 Noise masking, data contamination, and noise immunity learning

9.3.3 Adaptation techniques for noisy speech recognition

9.3.4 Minimum error training

9.3.5 Stress and channel compensation

9.3.6 Concluding remarks

Chapter 10 WORD-SPOTTING AND REJECTION

10.1 WORD-SPOTTING VERSUS ENDPOINT-BASED RECOGNITION

10.1.1 Preliminaries

10.1.2 Template matching word-spotters

10.1.3 Training garbage (or filler) models

10.1.4 Word-spotting and large vocabulary recognition

10.1.5 Vocabulary-independent word-spotting and user-defined keywords

10.1.6 Performance measures

10.1.7 Post word-spotting processing and rejection

10.1.8 Examples of word-spotting applications

10.2 CONFIDENCE MEASURES AND THE NEW WORD PROBLEM

10.2.1 Recognition confidence measures

10.2.2 Detecting out-of-vocabulary words and adding new words

Chapter 11 SPONTANEOUS SPEECH

11.1 INTRODUCTION

11.2 THE ATIS DATABASE AND SPONTANEOUS SPEECH CORPORA

11.3 THE SPEECH RECOGNITION-NATURAL LANGUAGE INTERFACE

11.4 THE LANGUAGE MODEL

11.5 ROBUST PARSING AND INTERPRETATION

Chapter 12 ON THE USE OF KNOWLEDGE IN ASR

12.1 STATEMENT OF THE PROBLEM

12.2 HYBRID MODELS FOR ASR

12.2.1 Preliminaries

12.2.2 Hybrid data-based approaches

12.2.3 ORION: A hybrid system for isolated word recognition

12.3 MODELS FOR COOPERATION BETWEEN KNOWLEDGE SOURCES

12.3.1 Statement of the problem

12.3.2 Bottom-up versus top-down processing

12.3.3 Heterarchical models for ASR

12.4 DEDUCTIVE AND ABDUCTIVE REASONING MODELS FOR ASR

12.4.1 Use of a production rule model

12.4.2 Truth maintenance and abduction

12.5 CONCLUSION

Chapter 13 APPLICATION DOMAIN, HUMAN FACTORS, AND DIALOGUE

13.1 THE APPLICATION DOMAIN

13.2 HUMAN FACTORS AND USER INTERFACE

13.3 DIALOGUE FOR IMPROVED ROBUSTNESS

13.3.1 Beyond sentences and turn taking: towards a natural interaction

13.3.2 Dialogue context and error correction

13.3.3 Multimodal dialogue systems

13.3.4 Different dialogue strategies for different applications

13.4 APPLICATION-INDEPENDENCE AND FAST PROTOTYPING

13.4.1 Introduction

13.4.2 Vocabulary-independent recognition

13.4.3 Application-independent dialogue strategies

13.4.4 The notion of global speech interface

13.5 THE ASSESSMENT AND ITS DIFFICULTIES

13.6 A ROBUST REAL-WORLD APPLICATION

13.6.1 Introduction

13.6.2 TOBIE-SOL: A conversational system for a weak-sighted operator

APPLICATION PERSPECTIVES FOR THE YEAR 2000

Appendix

Index
