Content-Based Audio Classification and Retrieval for Audiovisual Data Parsing is an up-to-date overview of audio and video content analysis. Included is extensive treatment of audiovisual data segmentation, indexing and retrieval based on multimodal media content analysis, and content-based management of audio data. In addition to the commonly studied audio types such as speech and music, the authors have included hybrid types of sounds that contain more than one kind of audio component such as speech or environmental sound with music in the background. Emphasis is also placed on semantic-level identification and classification of environmental sounds. The authors introduce a new generic audio retrieval system on top of the audio archiving schemes. Both theoretical analysis and implementation issues are presented. The developing MPEG-7 standards are explored.
Content-Based Audio Classification and Retrieval for Audiovisual Data Parsing will be especially useful to researchers and graduate-level students designing and developing fully functional audiovisual systems for audio/video content parsing of multimedia streams.
With Moving Picture Experts Group (MPEG) technologies, multimedia has become hotter than ever. While previous research on the automatic segmentation, indexing, and retrieval of audiovisual data has focused primarily on the pictorial part, it is increasingly recognized that a fully functioning system for video content parsing requires a proper mix of audio as well as visual information. Ergo, the authors devote much of this monograph to the content-based management of audio data, based on a three-stage hierarchical system. Some of the experimental results and illustrations featured are derived from the video portion of MPEG-7 test data. Zhang and Kuo are with the Integrated Media Systems Center and department of electrical engineering systems at the U. of Southern California, Los Angeles. Annotation c. Book News, Inc., Portland, OR (booknews.com)