E-book: Computational Paralinguistics: Emotion, Affect and Personality in Speech and Language Processing

Björn Schuller (Technische Universität München), Anton Batliner (University of Erlangen-Nuremberg)
  • Format: PDF+DRM
  • Publication date: 05-Sep-2013
  • Publisher: John Wiley & Sons Inc
  • Language: eng
  • ISBN-13: 9781118706633
  • Price: 117,26 €*
  • * The price is final, i.e. no further discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you need to install special software to read it. You also need to create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorised with the same Adobe ID).

    Required software
    To read on mobile devices (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; it should not be confused with Adobe Reader, which is most likely already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

This book presents the methods, tools and techniques currently used to automatically recognise the affect, emotion, personality and everything else beyond linguistics (paralinguistics) expressed by or embedded in human speech and language.

It is the first book to provide such a systematic survey of paralinguistics in speech and language processing. The technology described has evolved mainly from automatic speech and speaker recognition and processing, but also takes into account recent developments within speech signal processing, machine intelligence and data mining.

Moreover, the book offers a hands-on approach by integrating actual data sets, software and open-source utilities, which makes it invaluable as a teaching tool and similarly useful for professionals already in the field.

Key features:

  • Provides an integrated presentation of basic research (in phonetics/linguistics and humanities) with state-of-the-art engineering approaches for speech signal processing and machine intelligence.
  • Explains the history and state of the art of all of the sub-fields which contribute to the topic of computational paralinguistics.
  • Covers the signal processing and machine learning aspects of the actual computational modelling of emotion and personality and explains the detection process from corpus collection to feature extraction and from model testing to system integration.
  • Details aspects of real-world system integration including distribution, weakly supervised learning and confidence measures.
  • Outlines machine learning approaches including static, dynamic and context-sensitive algorithms for classification and regression.
  • Includes a tutorial on freely available toolkits, such as the open-source openEAR toolkit for emotion and affect recognition co-developed by one of the authors, and a listing of standard databases and feature sets used in the field to allow for immediate experimentation, enabling the reader to build an emotion detection model on an existing corpus.

Preface
Acknowledgements
List of Abbreviations
PART I Foundations
1 Introduction
1.1 What is Computational Paralinguistics? A First Approximation
1.2 History and Subject Area
1.3 Form versus Function
1.4 Further Aspects
1.4.1 The Synthesis of Emotion and Personality
1.4.2 Multimodality: Analysis and Generation
1.4.3 Applications, Usability and Ethics
1.5 Summary and Structure of the Book
References
2 Taxonomies
2.1 Traits versus States
2.2 Acted versus Spontaneous
2.3 Complex versus Simple
2.4 Measured versus Assessed
2.5 Categorical versus Continuous
2.6 Felt versus Perceived
2.7 Intentional versus Instinctual
2.8 Consistent versus Discrepant
2.9 Private versus Social
2.10 Prototypical versus Peripheral
2.11 Universal versus Culture-Specific
2.12 Unimodal versus Multimodal
2.13 All These Taxonomies -- So What?
2.13.1 Emotion Data: The FAU AEC
2.13.2 Non-native Data: The C-AuDiT corpus
References
3 Aspects of Modelling
3.1 Theories and Models of Personality
3.2 Theories and Models of Emotion and Affect
3.3 Type and Segmentation of Units
3.4 Typical versus Atypical Speech
3.5 Context
3.6 Lab versus Life, or Through the Looking Glass
3.7 Sheep and Goats, or Single Instance Decision versus Cumulative Evidence and Overall Performance
3.8 The Few and the Many, or How to Analyse a Hamburger
3.9 Reifications, and What You are Looking for is What You Get
3.10 Magical Numbers versus Sound Reasoning
References
4 Formal Aspects
4.1 The Linguistic Code and Beyond
4.2 The Non-Distinctive Use of Phonetic Elements
4.2.1 Segmental Level: The Case of /r/ Variants
4.2.2 Supra-segmental Level: The Case of Pitch and Fundamental Frequency -- and of Other Prosodic Parameters
4.2.3 In Between: The Case of Other Voice Qualities, Especially Laryngealisation
4.3 The Non-Distinctive Use of Linguistics Elements
4.3.1 Words and Word Classes
4.3.2 Phrase Level: The Case of Filler Phrases and Hedges
4.4 Disfluencies
4.5 Non-Verbal, Vocal Events
4.6 Common Traits of Formal Aspects
References
5 Functional Aspects
5.1 Biological Trait Primitives
5.1.1 Speaker Characteristics
5.2 Cultural Trait Primitives
5.2.1 Speech Characteristics
5.3 Personality
5.4 Emotion and Affect
5.5 Subjectivity and Sentiment Analysis
5.6 Deviant Speech
5.6.1 Pathological Speech
5.6.2 Temporarily Deviant Speech
5.6.3 Non-native Speech
5.7 Social Signals
5.8 Discrepant Communication
5.8.1 Indirect Speech, Irony, and Sarcasm
5.8.2 Deceptive Speech
5.8.3 Off-Talk
5.9 Common Traits of Functional Aspects
References
6 Corpus Engineering
6.1 Annotation
6.1.1 Assessment of Annotations
6.1.2 New Trends
6.2 Corpora and Benchmarks: Some Examples
6.2.1 FAU Aibo Emotion Corpus
6.2.2 Gender Corpus
6.2.3 TUM AVIC Corpus
6.2.4 Alcohol Language Corpus
6.2.5 Sleepy Language Corpus
6.2.6 Speaker Personality Corpus
6.2.7 Speaker Likability Database
6.2.8 NKI CCRT Speech Corpus
6.2.9 TIMIT Database
6.2.10 Final Remarks on Databases
References
PART II Modelling
7 Computational Modelling of Paralinguistics: Overview
References
8 Acoustic Features
8.1 Digital Signal Representation
8.2 Short Time Analysis
8.3 Acoustic Segmentation
8.4 Continuous Descriptors
8.4.1 Intensity
8.4.2 Zero Crossings
8.4.3 Autocorrelation
8.4.4 Spectrum and Cepstrum
8.4.5 Linear Prediction
8.4.6 Line Spectral Pairs
8.4.7 Perceptual Linear Prediction
8.4.8 Formants
8.4.9 Fundamental Frequency and Voicing Probability
8.4.10 Jitter and Shimmer
8.4.11 Derived Low-Level Descriptors
References
9 Linguistic Features
9.1 Textual Descriptors
9.2 Preprocessing
9.3 Reduction
9.3.1 Stopping
9.3.2 Stemming
9.3.3 Tagging
9.4 Modelling
9.4.1 Vector Space Modelling
9.4.2 On-line Knowledge
References
10 Supra-segmental Features
10.1 Functionals
10.2 Feature Brute-Forcing
10.3 Feature Stacking
References
11 Machine-Based Modelling
11.1 Feature Relevance Analysis
11.2 Machine Learning
11.2.1 Static Classification
11.2.2 Dynamic Classification: Hidden Markov Models
11.2.3 Regression
11.3 Testing Protocols
11.3.1 Partitioning
11.3.2 Balancing
11.3.3 Performance Measures
11.3.4 Result Interpretation
References
12 System Integration and Application
12.1 Distributed Processing
12.2 Autonomous and Collaborative Learning
12.3 Confidence Measures
References
13 'Hands-On': Existing Toolkits and Practical Tutorial
13.1 Related Toolkits
13.2 openSMILE
13.2.1 Available Feature Extractors
13.3 Practical Computational Paralinguistics How-to
13.3.1 Obtaining and Installing openSMILE
13.3.2 Extracting Features
13.3.3 Classification and Regression
References
14 Epilogue
Appendix
A.1 openSMILE Feature Sets Used at Interspeech Challenges
A.2 Feature Encoding Scheme
References
Index
Björn Schuller, Technische Universität München, Germany

Anton Batliner, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany