Preface |
|
xiii | |
Acknowledgements |
|
xv | |
|
|
xvii | |
|
|
|
|
3 | (18) |
|
1.1 What is Computational Paralinguistics? A First Approximation |
|
|
3 | (4) |
|
1.2 History and Subject Area |
|
|
7 | (3) |
|
|
10 | (2) |
|
|
12 | (5) |
|
1.4.1 The Synthesis of Emotion and Personality |
|
|
12 | (1) |
|
1.4.2 Multimodality: Analysis and Generation |
|
|
13 | (2) |
|
1.4.3 Applications, Usability and Ethics |
|
|
15 | (2) |
|
1.5 Summary and Structure of the Book |
|
|
17 | (4) |
|
|
18 | (3) |
|
|
21 | (32) |
|
|
21 | (4) |
|
2.2 Acted versus Spontaneous |
|
|
25 | (5) |
|
2.3 Complex versus Simple |
|
|
30 | (1) |
|
2.4 Measured versus Assessed |
|
|
31 | (2) |
|
2.5 Categorical versus Continuous |
|
|
33 | (2) |
|
2.6 Felt versus Perceived |
|
|
35 | (2) |
|
2.7 Intentional versus Instinctual |
|
|
37 | (1) |
|
2.8 Consistent versus Discrepant |
|
|
38 | (1) |
|
2.9 Private versus Social |
|
|
39 | (1) |
|
2.10 Prototypical versus Peripheral |
|
|
40 | (1) |
|
2.11 Universal versus Culture-Specific |
|
|
41 | (2) |
|
2.12 Unimodal versus Multimodal |
|
|
43 | (1) |
|
2.13 All These Taxonomies -- So What? |
|
|
44 | (9) |
|
2.13.1 Emotion Data: The FAU AEC |
|
|
45 | (2) |
|
2.13.2 Non-native Data: The C-AuDiT Corpus |
|
|
47 | (1) |
|
|
48 | (5) |
|
|
53 | (26) |
|
3.1 Theories and Models of Personality |
|
|
53 | (2) |
|
3.2 Theories and Models of Emotion and Affect |
|
|
55 | (3) |
|
3.3 Type and Segmentation of Units |
|
|
58 | (2) |
|
3.4 Typical versus Atypical Speech |
|
|
60 | (1) |
|
|
61 | (1) |
|
3.6 Lab versus Life, or Through the Looking Glass |
|
|
62 | (2) |
|
3.7 Sheep and Goats, or Single Instance Decision versus Cumulative Evidence and Overall Performance |
|
|
64 | (1) |
|
3.8 The Few and the Many, or How to Analyse a Hamburger |
|
|
65 | (2) |
|
3.9 Reifications, and What You are Looking for is What You Get |
|
|
67 | (1) |
|
3.10 Magical Numbers versus Sound Reasoning |
|
|
68 | (11) |
|
|
74 | (5) |
|
|
79 | (28) |
|
4.1 The Linguistic Code and Beyond |
|
|
79 | (2) |
|
4.2 The Non-Distinctive Use of Phonetic Elements |
|
|
81 | (10) |
|
4.2.1 Segmental Level: The Case of /r/ Variants |
|
|
81 | (1) |
|
4.2.2 Supra-segmental Level: The Case of Pitch and Fundamental Frequency -- and of Other Prosodic Parameters |
|
|
82 | (4) |
|
4.2.3 In Between: The Case of Other Voice Qualities, Especially Laryngealisation |
|
|
86 | (5) |
|
4.3 The Non-Distinctive Use of Linguistic Elements |
|
|
91 | (5) |
|
4.3.1 Words and Word Classes |
|
|
91 | (3) |
|
4.3.2 Phrase Level: The Case of Filler Phrases and Hedges |
|
|
94 | (2) |
|
|
96 | (2) |
|
4.5 Non-Verbal, Vocal Events |
|
|
98 | (2) |
|
4.6 Common Traits of Formal Aspects |
|
|
100 | (7) |
|
|
101 | (6) |
|
|
107 | (52) |
|
5.1 Biological Trait Primitives |
|
|
109 | (3) |
|
5.1.1 Speaker Characteristics |
|
|
111 | (1) |
|
5.2 Cultural Trait Primitives |
|
|
112 | (3) |
|
5.2.1 Speech Characteristics |
|
|
114 | (1) |
|
|
115 | (4) |
|
|
119 | (4) |
|
5.5 Subjectivity and Sentiment Analysis |
|
|
123 | (1) |
|
|
124 | (7) |
|
5.6.1 Pathological Speech |
|
|
125 | (4) |
|
5.6.2 Temporarily Deviant Speech |
|
|
129 | (1) |
|
|
130 | (1) |
|
|
131 | (4) |
|
5.8 Discrepant Communication |
|
|
135 | (5) |
|
5.8.1 Indirect Speech, Irony, and Sarcasm |
|
|
136 | (2) |
|
|
138 | (1) |
|
|
139 | (1) |
|
5.9 Common Traits of Functional Aspects |
|
|
140 | (19) |
|
|
141 | (18) |
|
|
159 | (20) |
|
|
160 | (4) |
|
6.1.1 Assessment of Annotations |
|
|
161 | (3) |
|
|
164 | (1) |
|
6.2 Corpora and Benchmarks: Some Examples |
|
|
164 | (15) |
|
6.2.1 FAU Aibo Emotion Corpus |
|
|
165 | (1) |
|
|
165 | (1) |
|
|
166 | (2) |
|
6.2.4 Alcohol Language Corpus |
|
|
168 | (1) |
|
6.2.5 Sleepy Language Corpus |
|
|
168 | (1) |
|
6.2.6 Speaker Personality Corpus |
|
|
169 | (1) |
|
6.2.7 Speaker Likability Database |
|
|
170 | (1) |
|
6.2.8 NKI CCRT Speech Corpus |
|
|
171 | (1) |
|
|
171 | (1) |
|
6.2.10 Final Remarks on Databases |
|
|
172 | (1) |
|
|
173 | (6) |
|
|
|
7 Computational Modelling of Paralinguistics: Overview |
|
|
179 | (6) |
|
|
183 | (2) |
|
|
185 | (32) |
|
8.1 Digital Signal Representation |
|
|
185 | (2) |
|
|
187 | (3) |
|
8.3 Acoustic Segmentation |
|
|
190 | (1) |
|
8.4 Continuous Descriptors |
|
|
190 | (27) |
|
|
190 | (1) |
|
|
191 | (1) |
|
|
192 | (2) |
|
8.4.4 Spectrum and Cepstrum |
|
|
194 | (4) |
|
|
198 | (4) |
|
8.4.6 Line Spectral Pairs |
|
|
202 | (1) |
|
8.4.7 Perceptual Linear Prediction |
|
|
203 | (2) |
|
|
205 | (2) |
|
8.4.9 Fundamental Frequency and Voicing Probability |
|
|
207 | (5) |
|
8.4.10 Jitter and Shimmer |
|
|
212 | (2) |
|
8.4.11 Derived Low-Level Descriptors |
|
|
214 | (1) |
|
|
214 | (3) |
|
|
217 | (13) |
|
|
217 | (1) |
|
|
218 | (1) |
|
|
218 | (2) |
|
|
218 | (1) |
|
|
219 | (1) |
|
|
219 | (1) |
|
|
220 | (10) |
|
9.4.1 Vector Space Modelling |
|
|
220 | (2) |
|
|
222 | (5) |
|
|
227 | (3) |
|
10 Supra-segmental Features |
|
|
230 | (5) |
|
|
231 | (1) |
|
10.2 Feature Brute-Forcing |
|
|
232 | (1) |
|
|
233 | (2) |
|
|
234 | (1) |
|
11 Machine-Based Modelling |
|
|
235 | (46) |
|
11.1 Feature Relevance Analysis |
|
|
235 | (3) |
|
|
238 | (26) |
|
11.2.1 Static Classification |
|
|
238 | (18) |
|
11.2.2 Dynamic Classification: Hidden Markov Models |
|
|
256 | (6) |
|
|
262 | (2) |
|
|
264 | (17) |
|
|
264 | (2) |
|
|
266 | (1) |
|
11.3.3 Performance Measures |
|
|
267 | (5) |
|
11.3.4 Result Interpretation |
|
|
272 | (5) |
|
|
277 | (4) |
|
12 System Integration and Application |
|
|
281 | (8) |
|
12.1 Distributed Processing |
|
|
281 | (3) |
|
12.2 Autonomous and Collaborative Learning |
|
|
284 | (2) |
|
|
286 | (3) |
|
|
287 | (2) |
|
13 'Hands-On': Existing Toolkits and Practical Tutorial |
|
|
289 | (15) |
|
|
289 | (1) |
|
|
290 | (4) |
|
13.2.1 Available Feature Extractors |
|
|
293 | (1) |
|
13.3 Practical Computational Paralinguistics How-to |
|
|
294 | (10) |
|
13.3.1 Obtaining and Installing openSMILE |
|
|
295 | (1) |
|
13.3.2 Extracting Features |
|
|
295 | (7) |
|
13.3.3 Classification and Regression |
|
|
302 | (1) |
|
|
303 | (1) |
|
|
304 | (3) |
|
|
307 | (8) |
|
A.1 openSMILE Feature Sets Used at Interspeech Challenges |
|
|
307 | (3) |
|
A.2 Feature Encoding Scheme |
|
|
310 | (5) |
|
|
314 | (1) |
Index |
|
315 | |