| | 1 | (14) |
| 1.1 Emotion: Psychological Perspective | 2 | (1) |
| 1.2 Emotion: Speech Signal Perspective | 3 | (5) |
| 1.2.1 Speech Production Mechanism | 4 | (1) |
| | 5 | (1) |
| | 5 | (2) |
| | 7 | (1) |
| 1.3 Emotional Speech Databases | 8 | (2) |
| 1.4 Applications of Speech Emotion Recognition | 10 | (1) |
| 1.5 Issues in Speech Emotion Recognition | 10 | (1) |
| 1.6 Objectives and Scope of the Work | 11 | (1) |
| 1.7 Main Highlights of Research Investigations | 12 | (1) |
| 1.8 Brief Overview of Contributions to This Book | 12 | (1) |
| 1.8.1 Emotion Recognition Using Excitation Source Information | 12 | (1) |
| 1.8.2 Emotion Recognition Using Vocal Tract Information | 12 | (1) |
| 1.8.3 Emotion Recognition Using Prosodic Information | 13 | (1) |
| 1.9 Organization of the Book | 13 | (2) |
| 2 Speech Emotion Recognition: A Review | 15 | (20) |
| | 15 | (1) |
| 2.2 Emotional Speech Corpora: A Review | 16 | (7) |
| 2.3 Excitation Source Features: A Review | 23 | (2) |
| 2.4 Vocal Tract System Features: A Review | 25 | (2) |
| 2.5 Prosodic Features: A Review | 27 | (1) |
| 2.6 Classification Models | 28 | (4) |
| 2.7 Motivation for the Present Work | 32 | (1) |
| 2.8 Summary of the Literature and Scope for the Present Work | 33 | (2) |
| 3 Emotion Recognition Using Excitation Source Information | 35 | (32) |
| | 35 | (1) |
| | 36 | (3) |
| 3.3 Emotional Speech Corpora | 39 | (4) |
| 3.3.1 Indian Institute of Technology Kharagpur-Simulated Emotional Speech Corpus: IITKGP-SESC | 40 | (2) |
| 3.3.2 Berlin Emotional Speech Database: Emo-DB | 42 | (1) |
| 3.4 Excitation Source Features for Emotion Recognition | 43 | (10) |
| 3.4.1 Higher-Order Relations Among LP Residual Samples | 43 | (1) |
| 3.4.2 Phase of LP Residual Signal | 44 | (2) |
| 3.4.3 Parameters of the Instants of Glottal Closure (Epoch Parameters) | 46 | (4) |
| 3.4.4 Dynamics of Epoch Parameters at Syllable Level | 50 | (1) |
| 3.4.5 Dynamics of Epoch Parameters at Utterance Level | 51 | (1) |
| 3.4.6 Glottal Pulse Parameters | 52 | (1) |
| 3.5 Classification Models | 53 | (3) |
| 3.5.1 Auto-associative Neural Networks | 53 | (1) |
| 3.5.2 Support Vector Machines | 54 | (2) |
| 3.6 Results and Discussion | 56 | (10) |
| | 66 | (1) |
| 4 Emotion Recognition Using Vocal Tract Information | 67 | (12) |
| | 67 | (1) |
| | 68 | (4) |
| 4.2.1 Linear Prediction Cepstral Coefficients (LPCCs) | 69 | (1) |
| 4.2.2 Mel Frequency Cepstral Coefficients (MFCCs) | 70 | (1) |
| | 71 | (1) |
| | 72 | (1) |
| 4.3.1 Gaussian Mixture Models (GMM) | 72 | (1) |
| 4.4 Results and Discussion | 73 | (4) |
| | 77 | (2) |
| 5 Emotion Recognition Using Prosodic Information | 79 | (14) |
| | 79 | (1) |
| 5.2 Prosodic Features: Importance in Emotion Recognition | 80 | (2) |
| | 82 | (3) |
| 5.4 Extraction of Global and Local Prosodic Features | 85 | (2) |
| 5.5 Results and Discussion | 87 | (4) |
| | 91 | (2) |
| 6 Summary and Conclusions | 93 | (6) |
| 6.1 Summary of the Present Work | 93 | (2) |
| 6.2 Contributions of the Present Work | 95 | (1) |
| 6.3 Conclusions from the Present Work | 95 | (1) |
| 6.4 Scope for Future Work | 95 | (4) |
| A Linear Prediction Analysis of Speech | 99 | (6) |
| A.1 The Prediction Error Signal | 101 | (1) |
| A.2 Estimation of Linear Prediction Coefficients | 102 | (3) |
| | 105 | (4) |
| C Gaussian Mixture Model (GMM) | 109 | (6) |
| | 110 | (3) |
| C.1.1 Expectation Maximization (EM) Algorithm | 110 | (1) |
| C.1.2 Maximum A Posteriori (MAP) Adaptation | 111 | (2) |
| | 113 | (2) |
| References | 115 | |