Preface |
|
xiii | |
Acronyms |
|
xv | |
List of Symbols |
|
xix | |
1 Introduction |
|
1 | (6) |
|
|
3 | (1) |
|
1.2 A Generalized Audio Content Analysis System |
|
|
4 | (3) |
2 Fundamentals |
|
7 | (24) |
|
|
7 | (7) |
|
|
7 | (2) |
|
|
9 | (1) |
|
2.1.3 Sampling and Quantization |
|
|
9 | (4) |
|
2.1.4 Statistical Signal Description |
|
|
13 | (1) |
|
|
14 | (17) |
|
|
14 | (4) |
|
2.2.2 Block-Based Processing |
|
|
18 | (2) |
|
|
20 | (3) |
|
2.2.4 Constant Q Transform |
|
|
23 | (1) |
|
2.2.5 Auditory Filterbanks |
|
|
24 | (1) |
|
2.2.6 Correlation Function |
|
|
24 | (4) |
|
|
28 | (3) |
3 Instantaneous Features |
|
31 | (40) |
|
|
33 | (2) |
|
|
33 | (1) |
|
|
33 | (1) |
|
|
34 | (1) |
|
|
34 | (1) |
|
3.1.5 Other Pre-Processing Options |
|
|
35 | (1) |
|
3.2 Statistical Properties |
|
|
35 | (6) |
|
|
36 | (1) |
|
|
36 | (1) |
|
|
36 | (1) |
|
|
36 | (1) |
|
|
37 | (1) |
|
3.2.6 Variance and Standard Deviation |
|
|
37 | (1) |
|
|
38 | (1) |
|
|
39 | (1) |
|
3.2.9 Generalized Central Moments |
|
|
40 | (1) |
|
3.2.10 Quantiles and Quantile Ranges |
|
|
40 | (1) |
|
|
41 | (13) |
|
|
42 | (2) |
|
|
44 | (1) |
|
|
45 | (2) |
|
|
47 | (1) |
|
|
48 | (1) |
|
|
49 | (2) |
|
3.3.7 Mel Frequency Cepstral Coefficients |
|
|
51 | (3) |
|
|
54 | (9) |
|
|
54 | (7) |
|
3.4.2 Autocorrelation Coefficients |
|
|
61 | (1) |
|
|
62 | (1) |
|
3.5 Feature Post-Processing |
|
|
63 | (8) |
|
|
64 | (1) |
|
3.5.2 Normalization and Mapping |
|
|
65 | (1) |
|
|
66 | (1) |
|
3.5.4 Feature Dimensionality Reduction |
|
|
66 | (5) |
4 Intensity |
|
71 | (8) |
|
4.1 Human Perception of Intensity and Loudness |
|
|
71 | (2) |
|
4.2 Representation of Dynamics in Music |
|
|
73 | (1) |
|
|
73 | (3) |
|
|
73 | (3) |
|
|
76 | (1) |
|
4.5 Psycho-Acoustic Loudness Features |
|
|
77 | (2) |
|
|
78 | (1) |
5 Tonal Analysis |
|
79 | (40) |
|
5.1 Human Perception of Pitch |
|
|
79 | (3) |
|
|
79 | (2) |
|
|
81 | (1) |
|
5.2 Representation of Pitch in Music |
|
|
82 | (9) |
|
5.2.1 Pitch Classes and Names |
|
|
82 | (1) |
|
|
83 | (1) |
|
5.2.3 Root Note, Mode, and Key |
|
|
83 | (3) |
|
|
86 | (2) |
|
5.2.5 The Frequency of Musical Pitch |
|
|
88 | (3) |
|
5.3 Fundamental Frequency Detection |
|
|
91 | (15) |
|
|
92 | (2) |
|
|
94 | (3) |
|
5.3.3 Monophonic Input Signals |
|
|
97 | (6) |
|
5.3.4 Polyphonic Input Signals |
|
|
103 | (3) |
|
5.4 Tuning Frequency Estimation |
|
|
106 | (2) |
|
|
108 | (8) |
|
|
108 | (4) |
|
|
112 | (4) |
|
|
116 | (3) |
6 Temporal Analysis |
|
119 | (20) |
|
6.1 Human Perception of Temporal Events |
|
|
119 | (4) |
|
|
119 | (3) |
|
|
122 | (1) |
|
|
122 | (1) |
|
|
123 | (1) |
|
6.2 Representation of Temporal Events in Music |
|
|
123 | (1) |
|
6.2.1 Tempo and Time Signature |
|
|
123 | (1) |
|
|
124 | (1) |
|
|
124 | (9) |
|
|
125 | (2) |
|
|
127 | (1) |
|
|
128 | (5) |
|
|
133 | (2) |
|
6.4.1 Beat Histogram Features |
|
|
134 | (1) |
|
6.5 Detection of Tempo and Beat Phase |
|
|
135 | (1) |
|
6.6 Detection of Meter and Downbeat |
|
|
136 | (3) |
7 Alignment |
|
139 | (12) |
|
|
139 | (7) |
|
|
143 | (1) |
|
|
144 | (1) |
|
|
145 | (1) |
|
7.2 Audio-to-Audio Alignment |
|
|
146 | (2) |
|
7.2.1 Ground Truth Data for Evaluation |
|
|
147 | (1) |
|
7.3 Audio-to-Score Alignment |
|
|
148 | (3) |
|
7.3.1 Real-Time Systems |
|
|
148 | (1) |
|
7.3.2 Non-Real-Time Systems |
|
|
149 | (2) |
8 Musical Genre, Similarity, and Mood |
|
151 | (12) |
|
8.1 Musical Genre Classification |
|
|
151 | (5) |
|
|
152 | (2) |
|
|
154 | (1) |
|
|
155 | (1) |
|
8.2 Related Research Fields |
|
|
156 | (7) |
|
8.2.1 Music Similarity Detection |
|
|
156 | (2) |
|
8.2.2 Mood Classification |
|
|
158 | (3) |
|
8.2.3 Instrument Recognition |
|
|
161 | (2) |
9 Audio Fingerprinting |
|
163 | (6) |
|
9.1 Fingerprint Extraction |
|
|
164 | (1) |
|
|
165 | (1) |
|
9.3 Fingerprinting System: Example |
|
|
166 | (3) |
10 Music Performance Analysis |
|
169 | (12) |
|
10.1 Musical Communication |
|
|
169 | (3) |
|
|
169 | (1) |
|
|
170 | (2) |
|
|
172 | (1) |
|
|
172 | (1) |
|
10.2 Music Performance Analysis |
|
|
172 | (9) |
|
|
173 | (4) |
|
|
177 | (4) |
A Convolution Properties |
|
181 | (4) |
|
|
181 | (1) |
|
|
181 | (1) |
|
|
182 | (1) |
|
|
183 | (1) |
|
|
183 | (2) |
B Fourier Transform |
|
185 | (14) |
|
B.1 Properties of the Fourier Transformation |
|
|
186 | (4) |
|
B.1.1 Inverse Fourier Transform |
|
|
186 | (1) |
|
|
186 | (1) |
|
B.1.3 Convolution and Multiplication |
|
|
186 | (1) |
|
|
187 | (1) |
|
B.1.5 Time and Frequency Shift |
|
|
188 | (1) |
|
|
188 | (1) |
|
B.1.7 Time and Frequency Scaling |
|
|
189 | (1) |
|
|
190 | (1) |
|
B.2 Spectrum of Example Time Domain Signals |
|
|
190 | (2) |
|
|
190 | (1) |
|
|
191 | (1) |
|
|
191 | (1) |
|
|
191 | (1) |
|
|
191 | (1) |
|
B.3 Transformation of Sampled Time Signals |
|
|
192 | (1) |
|
B.4 Short Time Fourier Transform of Continuous Signals |
|
|
192 | (3) |
|
|
193 | (2) |
|
B.5 Discrete Fourier Transform |
|
|
195 | (4) |
|
|
196 | (1) |
|
B.5.2 Fast Fourier Transform |
|
|
197 | (2) |
C Principal Component Analysis |
|
199 | (2) |
|
C.1 Computation of the Transformation Matrix |
|
|
200 | (1) |
|
C.2 Interpretation of the Transformation Matrix |
|
|
200 | (1) |
D Software for Audio Analysis |
|
201 | (6) |
|
D.1 Software Frameworks and Applications |
|
|
202 | (2) |
|
|
202 | (1) |
|
|
202 | (1) |
|
|
203 | (1) |
|
|
203 | (1) |
|
|
203 | (1) |
|
D.2 Software Libraries and Toolboxes |
|
|
204 | (3) |
|
|
204 | (1) |
|
|
205 | (1) |
|
|
206 | (1) |
References |
|
207 | (36) |
Index |
|
243 | |