E-book: MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval

Hyoung-Gook Kim (Technical University of Berlin, Germany), Nicolas Moreau (Technical University of Berlin, Germany), Thomas Sikora (Technical University of Berlin, Germany)
  • Format: PDF+DRM
  • Publication date: 03-Feb-2006
  • Publisher: John Wiley & Sons Inc
  • Language: English
  • ISBN-13: 9780470093351
  • Price: 138,26 €*
  • * The price is final; no further discounts apply.
  • This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital rights management (DRM)
    The publisher has issued this e-book in encrypted form, which means that you need to install special software to read it. You also need to create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

    Required software
    To read on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read on a PC or Mac, you need to install Adobe Digital Editions (a free application designed specifically for reading e-books; not to be confused with Adobe Reader, which is probably already installed on your computer).

    This e-book cannot be read on an Amazon Kindle.

Kim (Samsung Advanced Institute of Technology, Korea) et al. offer strategies and algorithms for the automatic extraction and description of audio, including speech, music, and other signals. They explain techniques for the analysis, description, and classification of digital audio waveforms, covering both MPEG-7 techniques and others compared against them. Coverage encompasses low-level descriptors, sound classification and similarity, spoken content, music description tools, fingerprinting and audio signal quality, and example applications. The book is aimed at electronics and communications engineers, researchers, and graduate students in those fields. Annotation ©2006 Book News, Inc., Portland, OR (booknews.com)

Advances in technology, such as MP3 players, the Internet and DVDs, have led to the production, storage and distribution of a wealth of audio signals, including speech, music and more general sound signals and their combinations. MPEG-7 audio tools were created to enable the navigation of this data, by providing an established framework for effective multimedia management. MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval is a unique insight into the technology, covering the following topics:
  • the fundamentals of MPEG-7 audio, principally low-level descriptors and sound classification and similarity;
  • spoken content description, and timbre, melody and tempo music description tools;
  • existing MPEG-7 applications and those currently being developed;
  • examples of audio technology beyond the scope of MPEG-7.

Essential reading for practising electronic and communications engineers designing and implementing MPEG-7 compliant systems, this book will also be a useful reference for researchers and graduate students working with multimedia database technology.

List of Acronyms
List of Symbols
Introduction
  Audio Content Description
  MPEG-7 Audio Content Description: An Overview
    MPEG-7 Low-Level Descriptors
    MPEG-7 Description Schemes
    MPEG-7 Description Definition Language (DDL)
    BiM (Binary Format for MPEG-7)
  Organization of the Book
Low-Level Descriptors
  Introduction
  Basic Parameters and Notations
    Time Domain
    Frequency Domain
  Scalable Series
    Series of Scalars
    Series of Vectors
    Binary Series
  Basic Descriptors
    Audio Waveform
    Audio Power
  Basic Spectral Descriptors
    Audio Spectrum Envelope
    Audio Spectrum Centroid
    Audio Spectrum Spread
    Audio Spectrum Flatness
  Basic Signal Parameters
    Audio Harmonicity
    Audio Fundamental Frequency
  Timbral Descriptors
    Temporal Timbral: Requirements
    Log Attack Time
    Temporal Centroid
    Spectral Timbral: Requirements
    Harmonic Spectral Centroid
    Harmonic Spectral Deviation
    Harmonic Spectral Spread
    Harmonic Spectral Variation
    Spectral Centroid
  Spectral Basis Representations
  Silence Segment
  Beyond the Scope of MPEG-7
    Other Low-Level Descriptors
    Mel-Frequency Cepstrum Coefficients
  References
Sound Classification and Similarity
  Introduction
  Dimensionality Reduction
    Singular Value Decomposition (SVD)
    Principal Component Analysis (PCA)
    Independent Component Analysis (ICA)
    Non-Negative Factorization (NMF)
  Classification Methods
    Gaussian Mixture Model (GMM)
    Hidden Markov Model (HMM)
    Neural Network (NN)
    Support Vector Machine (SVM)
  MPEG-7 Sound Classification
    MPEG-7 Audio Spectrum Projection (ASP) Feature Extraction
    Training Hidden Markov Models (HMMs)
    Classification of Sounds
  Comparison of MPEG-7 Audio Spectrum Projection vs. MFCC Features
  Indexing and Similarity
    Audio Retrieval Using Histogram Sum of Squared Differences
  Simulation Results and Discussion
    Plots of MPEG-7 Audio Descriptors
    Parameter Selection
    Results for Distinguishing Between Speech, Music and Environmental Sound
    Results of Sound Classification Using Three Audio Taxonomy Methods
    Results for Speaker Recognition
    Results of Musical Instrument Classification
    Audio Retrieval Results
  Conclusions
  References
Spoken Content
  Introduction
  Automatic Speech Recognition
    Basic Principles
    Types of Speech Recognition Systems
    Recognition Results
  MPEG-7 Spoken Content Description
    General Structure
    SpokenContentHeader
    SpokenContentLattice
  Application: Spoken Document Retrieval
    Basic Principles of IR and SDR
    Vector Space Models
    Word-Based SDR
    Sub-Word-Based Vector Space Models
    Sub-Word String Matching
    Combining Word and Sub-Word Indexing
  Conclusions
    MPEG-7 Interoperability
    MPEG-7 Flexibility
    Perspectives
  References
Music Description Tools
  Timbre
    Introduction
    Instrument Timbre
    HarmonicInstrumentTimbre
    PercussiveInstrumentTimbre
    Distance Measures
  Melody
    Melody
    Meter
    Scale
    Key
    MelodyContour
    MelodySequence
  Tempo
    AudioTempo
    AudioBPM
  Application Example: Query-by-Humming
    Monophonic Melody Transcription
    Polyphonic Melody Transcription
    Comparison of Melody Contours
  References
Fingerprinting and Audio Signal Quality
  Introduction
  Audio Signature
    Generalities on Audio Fingerprinting
    Fingerprint Extraction
    Distance and Searching Methods
    MPEG-7-Standardized AudioSignature
  Audio Signal Quality
    AudioSignalQuality Description Scheme
    BroadcastReady
    IsOriginalMono
    BackgroundNoiseLevel
    CrossChannelCorrelation
    RelativeDelay
    Balance
    DcOffset
    Bandwidth
    TransmissionTechnology
    ErrorEvent and ErrorEventList
  References
Application
  Introduction
  Automatic Audio Segmentation
    Feature Extraction
    Segmentation
    Metric-Based Segmentation
    Model-Selection-Based Segmentation
    Hybrid Segmentation
    Hybrid Segmentation Using MPEG-7 ASP
    Segmentation Results
  Sound Indexing and Browsing of Home Video Using Spoken Annotations
    A Simple Experimental System
    Retrieval Results
  Highlights Extraction for Sport Programmes Using Audio Event Detection
    Goal Event Segment Selection
    System Results
  A Spoken Document Retrieval System for Digital Photo Albums
  References
Index


Hyoung-Gook Kim, Researcher of the MPEG-7 Audio Project at the Communication Systems Group, Technical University of Berlin, Communication Systems Group, Sekr. EN 1, Einsteinufer 17, D-10587 Berlin

Nicolas Moreau, Researcher of the MPEG-7 Audio Project at the Communication Systems Group, Technical University of Berlin, Communication Systems Group, Sekr. EN 1, Einsteinufer 17, D-10587 Berlin

Thomas Sikora, Professor and head of the Communication Systems Group, Technical University of Berlin, Communication Systems Group, Sekr. EN 1, Einsteinufer 17, D-10587 Berlin