Cognitively Inspired Audiovisual Speech Filtering: Towards an Intelligent, Fuzzy Based, Multimodal, Two-Stage Speech Enhancement System 2015 ed. [Paperback]

3.50/5 (4 ratings from Goodreads)

Andrew Abel, Amir Hussain

Format: Paperback / softback, 121 pages, height x width: 235x155 mm, weight: 2488 g, 37 Illustrations, color; 4 Illustrations, black and white; XVIII, 121 p. 41 illus., 37 illus. in color., 1 Paperback / softback
Series: SpringerBriefs in Cognitive Computation 5
Publication date: 19-Aug-2015
Publisher: Springer International Publishing AG
ISBN-10: 3319135082
ISBN-13: 9783319135083

Other books on the subject:

Image processing
Natural language & machine translation
Graphical & digital media applications - (Currently in store: 4 titles)

Paperback
Price: 48,70 €*
* the price is final, i.e. no further discounts apply
List price: 57,29 €
You save 15%
Delivery of the book from the publisher takes approximately 2-4 weeks
Free delivery
Order time: 2-4 weeks

Permalink: https://www.kriso.ee/db/9783319135083.html

This book summarises the cognitively inspired basis behind multimodal speech enhancement, covering the relationship between audio and visual modalities in speech, as well as recent research into audiovisual speech correlation. A number of audiovisual speech filtering approaches that make use of this relationship are also discussed. A novel multimodal speech enhancement system, which uses both visual and audio information to filter speech, is presented, and the book explores extending this system with fuzzy logic to demonstrate an initial implementation of an autonomous, adaptive, and context-aware multimodal system. The work also discusses the challenges of testing such a system, the limitations of many current audiovisual speech corpora, and a suitable approach to developing a corpus designed to test this novel, cognitively inspired speech filtering system.

Table of contents

1 Introduction
1.1 Multimodal Speech Enhancement
1.2 Cognitively Inspired Intelligent Flexibility
References

2 Audio and Visual Speech Relationship
2.1 Audio and Visual Speech Production
2.1.1 Speech Production
2.1.2 Phonemes and Visemes
2.2 Multimodal Speech Phenomena
2.2.1 Cocktail Party Problem
2.2.2 McGurk Effect
2.2.3 Lombard Effect
2.3 Audiovisual Speech Correlation Background
2.4 Multimodal Correlation Analysis
2.4.1 Correlation Measurement
2.4.2 Multimodal Correlation Analysis Results
References

3 The Research Context
3.1 Application of Speech Processing Techniques to Hearing Aids
3.1.1 Directional Microphones
3.1.2 Noise Cancelling Algorithms
3.2 Audiovisual Speech Enhancement Techniques
3.2.1 Background
3.2.2 Audiovisual Blind Source Separation
3.2.3 Multimodal Fragment Decoding
3.2.4 Visually Derived Wiener Filtering
3.3 Visual Tracking and Detection
3.3.1 Lip Tracking
3.3.2 Region of Interest Detection
3.4 Audiovisual Speech Corpora
3.4.1 The BANCA Speech Database
3.4.2 The Extended M2VTS Database
3.4.3 The AVICAR Speech Database
3.4.4 The VidTIMIT Multimodal Database
3.4.5 The GRID Corpus
References

4 A Two Stage Multimodal Speech Enhancement System
4.1 Overall Design Framework of the Two-Stage Multimodal System
4.2 Reverberant Room Environment
4.3 Multiple Microphone Array
4.4 Audio Feature Extraction
4.5 Visual Feature Extraction
4.6 Visually Derived Wiener Filtering
4.7 Gaussian Mixture Model for Audiovisual Clean Speech Estimation
4.8 Beamforming
References

5 Experiments, Results, and Analysis
5.1 Speech Enhancement Evaluation Approaches
5.1.1 Subjective Speech Quality Evaluation Measures
5.1.2 Objective Speech Quality Evaluation Measures
5.2 Preliminary Experimentation
5.3 Automated Lip Detection Evaluation
5.3.1 Problem Description
5.3.2 Experiment Setup
5.3.3 Results and Discussion
5.4 Noisy Audio Environments
5.4.1 Problem Description
5.4.2 Experiment Setup
5.4.3 Results and Discussion
5.5 Testing with Novel Corpus
5.5.1 Problem Description
5.5.2 Experiment Setup
5.5.3 Results and Discussion
5.6 Inconsistent Audio Environment
5.6.1 Problem Description
5.6.2 Experiment Setup
5.6.3 Results and Discussion
References

6 Towards Fuzzy Logic Based Multimodal Speech Filtering
6.1 Limitations of Current Two-Stage System
6.2 Fuzzy Logic Based Model Justification
6.2.1 Requirements of Autonomous, Adaptive, and Context Aware Speech Filtering
6.2.2 Fuzzy Logic Based Decision Making
6.3 Potential Alternative Approaches
6.3.1 Hidden Markov Models
6.3.2 Neural Networks
6.4 Fuzzy Based Multimodal Speech Enhancement Framework
6.4.1 Overall Design Framework of Fuzzy System
6.4.2 Fuzzy Logic Based Framework Inputs
6.4.3 Fuzzy Logic Based Switching Supervisor
References

7 Evaluation of Fuzzy Logic Proof of Concept
7.1 Testing Requirements
7.2 Experimentation Limitations
7.3 Recording of Challenging Audiovisual Speech Corpus
7.3.1 Corpus Configuration
7.4 Fuzzy Input Variable Evaluation
7.4.1 Visual Quality Fuzzy Indicator
7.4.2 Previous Frame Fuzzy Input Variable
7.5 Detailed System Evaluation
7.5.1 Problem Description
7.5.2 Experiment Setup
7.5.3 Subjective Testing with Broadband Noise
7.5.4 Detailed Fuzzy Switching Performance
7.6 Discussion of Results
References

8 Potential Future Research Directions
8.1 Improvement of Individual Speech Processing Components
8.2 Extension of Overall Speech Filtering Framework
8.3 Further Development of Fuzzy Logic Based Switching Controller
8.4 Practical Implementation of System
References

Index
