E-book: Emotion Recognition Using Speech Features

  • Format: PDF+DRM
  • Price: 55,56 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is most likely already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

“Emotion Recognition Using Speech Features” covers the emotion-specific features present in speech. The authors also discuss models suitable for capturing emotion-specific information and for distinguishing between emotions. The content of this book is important for designing and developing natural and sophisticated speech systems. In this Brief, Drs. Rao and Koolagudi discuss how emotion-specific information is embedded in speech and how to acquire emotion-specific knowledge using appropriate statistical models. The authors also describe how to exploit multiple sources of evidence derived from various features and models. The acquired emotion-specific knowledge is useful for synthesizing emotions. Features include discussion of:

  • Global and local prosodic features at the syllable, word and phrase levels, which help capture emotion-discriminative information;
  • Exploiting complementary evidence obtained from excitation source, vocal tract system and prosodic features to enhance emotion recognition performance;
  • Proposed multi-stage and hybrid models for improving emotion recognition performance.

This Brief is for researchers working on speech-based products in industries such as mobile phone manufacturing, automotive and entertainment, as well as researchers involved in basic and applied speech processing research.

1 Introduction
1.1 Emotion: Psychological Perspective
1.2 Emotion: Speech Signal Perspective
1.2.1 Speech Production Mechanism
1.2.2 Source Features
1.2.3 System Features
1.2.4 Prosodic Features
1.3 Emotional Speech Databases
1.4 Applications of Speech Emotion Recognition
1.5 Issues in Speech Emotion Recognition
1.6 Objectives and Scope of the Work
1.7 Main Highlights of Research Investigations
1.8 Brief Overview of Contributions to This Book
1.8.1 Emotion Recognition Using Excitation Source Information
1.8.2 Emotion Recognition Using Vocal Tract Information
1.8.3 Emotion Recognition Using Prosodic Information
1.9 Organization of the Book
2 Speech Emotion Recognition: A Review
2.1 Introduction
2.2 Emotional Speech Corpora: A Review
2.3 Excitation Source Features: A Review
2.4 Vocal Tract System Features: A Review
2.5 Prosodic Features: A Review
2.6 Classification Models
2.7 Motivation for the Present Work
2.8 Summary of the Literature and Scope for the Present Work
3 Emotion Recognition Using Excitation Source Information
3.1 Introduction
3.2 Motivation
3.3 Emotional Speech Corpora
3.3.1 Indian Institute of Technology Kharagpur-Simulated Emotional Speech Corpus: IITKGP-SESC
3.3.2 Berlin Emotional Speech Database: Emo-DB
3.4 Excitation Source Features for Emotion Recognition
3.4.1 Higher-Order Relations Among LP Residual Samples
3.4.2 Phase of LP Residual Signal
3.4.3 Parameters of the Instants of Glottal Closure (Epoch Parameters)
3.4.4 Dynamics of Epoch Parameters at Syllable Level
3.4.5 Dynamics of Epoch Parameters at Utterance Level
3.4.6 Glottal Pulse Parameters
3.5 Classification Models
3.5.1 Auto-associative Neural Networks
3.5.2 Support Vector Machines
3.6 Results and Discussion
3.7 Summary
4 Emotion Recognition Using Vocal Tract Information
4.1 Introduction
4.2 Feature Extraction
4.2.1 Linear Prediction Cepstral Coefficients (LPCCs)
4.2.2 Mel Frequency Cepstral Coefficients (MFCCs)
4.2.3 Formant Features
4.3 Classifiers
4.3.1 Gaussian Mixture Models (GMM)
4.4 Results and Discussion
4.5 Summary
5 Emotion Recognition Using Prosodic Information
5.1 Introduction
5.2 Prosodic Features: Importance in Emotion Recognition
5.3 Motivation
5.4 Extraction of Global and Local Prosodic Features
5.5 Results and Discussion
5.6 Summary
6 Summary and Conclusions
6.1 Summary of the Present Work
6.2 Contributions of the Present Work
6.3 Conclusions from the Present Work
6.4 Scope for Future Work
A Linear Prediction Analysis of Speech
A.1 The Prediction Error Signal
A.2 Estimation of Linear Prediction Coefficients
B MFCC Features
C Gaussian Mixture Model (GMM)
C.1 Training the GMMs
C.1.1 Expectation Maximization (EM) Algorithm
C.1.2 Maximum A Posteriori (MAP) Adaptation
C.2 Testing
References

K. Sreenivasa Rao is at the Indian Institute of Technology, Kharagpur, India. Shashidhar G. Koolagudi is at Graphic Era University, Dehradun, India.