E-book: Parametric Time-Frequency Domain Spatial Audio

Edited by Symeon Delikaris-Manias, Archontis Politis, and Ville Pulkki

Format: PDF+DRM
Series: IEEE Press
Publication date: 04-Oct-2017
Publisher: Wiley-IEEE Press
Language: English
ISBN-13: 9781119252580

Other books on this subject:

Acoustic & sound engineering

Price: 120,97 €*
* The price is final; no further discounts apply.
This e-book is intended for personal use only. E-books cannot be returned.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital rights management (DRM)
The publisher has issued this e-book in encrypted form, which means that you must install special software to read it. You must also create an Adobe ID. The e-book can be read by 1 user and downloaded to up to 6 devices (all authorized with the same Adobe ID).

Required software
To read on a mobile device (phone or tablet), install this free app: PocketBook Reader (iOS / Android)

To read on a PC or Mac, install Adobe Digital Editions (a free application made specifically for reading e-books; it should not be confused with Adobe Reader, which is probably already installed on your computer).

This e-book cannot be read on an Amazon Kindle.

A comprehensive guide that addresses the theory and practice of spatial audio

This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts starting with an overview of the fundamentals. It then goes on to explain the reproduction of spatial sound before offering an examination of signal-dependent spatial filtering. The book finishes with coverage of both current and future applications and the direction that spatial audio research is heading in.

Parametric Time-Frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming. It covers the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. The book teaches readers the tools needed for such processing and provides an overview of existing research. It also presents recent projects and commercial applications built on these systems.

Provides an in-depth presentation of the principles, past developments, state-of-the-art methods, and future research directions of spatial audio technologies
Includes contributions from leading researchers in the field
Offers MATLAB codes with selected chapters

An advanced book aimed at readers who are comfortable with the mathematics of digital signal processing and sound field analysis, Parametric Time-Frequency Domain Spatial Audio is best suited for researchers in academia and in the audio industry.

List of Contributors
Preface
About the Companion Website

Part I Analysis and Synthesis of Spatial Sound

1 Time-Frequency Processing: Methods and Tools
Juha Vilkamo, Tom Bäckström
1.1 Introduction
1.2 Time-Frequency Processing
1.2.1 Basic Structure
1.2.2 Uniform Filter Banks
1.2.3 Prototype Filters and Modulation
1.2.4 A Robust Complex-Modulated Filter Bank, and Comparison with STFT
1.2.5 Overlap-Add and Windowing
1.2.6 Example Implementation of a Robust Filter Bank in Matlab
1.2.7 Cascaded Filters
1.3 Processing of Spatial Audio
1.3.1 Stochastic Estimates
1.3.2 Decorrelation
1.3.3 Optimal and Generalized Solution for Spatial Sound Processing Using Covariance Matrices
References

2 Spatial Decomposition by Spherical Array Processing
David Lou Alon, Boaz Rafaely
2.1 Introduction
2.2 Sound Field Measurement by a Spherical Array
2.3 Array Processing and Plane-Wave Decomposition
2.4 Sensitivity to Noise and Standard Regularization Methods
2.5 Optimal Noise-Robust Design
2.5.1 PWD Estimation Error Measure
2.5.2 PWD Error Minimization
2.5.3 R-PWD Simulation Study
2.6 Spatial Aliasing and High Frequency Performance Limit
2.7 High Frequency Bandwidth Extension by Aliasing Cancellation
2.7.1 Spatial Aliasing Error
2.7.2 AC-PWD Simulation Study
2.8 High Performance Broadband PWD Example
2.8.1 Broadband Measurement Model
2.8.2 Minimizing Broadband PWD Error
2.8.3 BB-PWD Simulation Study
2.9 Summary
2.10 Acknowledgment
References

3 Sound Field Analysis Using Sparse Recovery
Craig T. Jin, Nicolas Epain, Tahereh Noohi
3.1 Introduction
3.2 The Plane-Wave Decomposition Problem
3.2.1 Sparse Plane-Wave Decomposition
3.2.2 The Iteratively Reweighted Least-Squares Algorithm
3.3 Bayesian Approach to Plane-Wave Decomposition
3.4 Calculating the IRLS Noise-Power Regularization Parameter
3.4.1 Estimation of the Relative Noise Power
3.5 Numerical Simulations
3.6 Experiment: Echoic Sound Scene Analysis
3.7 Conclusions
Appendix
References

Part II Reproduction of Spatial Sound

4 Overview of Time-Frequency Domain Parametric Spatial Audio Techniques
Archontis Politis, Symeon Delikaris-Manias, Ville Pulkki
4.1 Introduction
4.2 Parametric Processing Overview
4.2.1 Analysis Principles
4.2.2 Synthesis Principles
4.2.3 Spatial Audio Coding and Up-Mixing
4.2.4 Spatial Sound Recording and Reproduction
4.2.5 Auralization of Measured Room Acoustics and Spatial Rendering of Room Impulse Responses
References

5 First-Order Directional Audio Coding (DirAC)
Ville Pulkki, Archontis Politis, Mikko-Ville Laitinen, Juha Vilkamo, Jukka Ahonen
5.1 Representing Spatial Sound with First-Order B-Format Signals
5.2 Some Notes on the Evolution of the Technique
5.3 DirAC with Ideal B-Format Signals
5.4 Analysis of Directional Parameters with Real Microphone Setups
5.4.1 DOA Analysis with Open 2D Microphone Arrays
5.4.2 DOA Analysis with 2D Arrays with a Rigid Baffle
5.4.3 DOA Analysis in Underdetermined Cases
5.4.4 DOA Analysis: Further Methods
5.4.5 Effect of Spatial Aliasing and Microphone Noise on the Analysis of Diffuseness
5.5 First-Order DirAC with Monophonic Audio Transmission
5.6 First-Order DirAC with Multichannel Audio Transmission
5.6.1 Stream-Based Virtual Microphone Rendering
5.6.2 Evaluation of Virtual Microphone DirAC
5.6.3 Discussion of Virtual Microphone DirAC
5.6.4 Optimized DirAC Synthesis
5.6.5 DirAC-Based Reproduction of Spaced-Array Recordings
5.7 DirAC Synthesis for Headphones and for Hearing Aids
5.7.1 Reproduction of B-Format Signals
5.7.2 DirAC in Hearing Aids
5.8 Optimizing the Time-Frequency Resolution of DirAC for Critical Signals
5.9 Example Implementation
5.9.1 Executing DirAC and Plotting Parameter History
5.9.2 DirAC Initialization
5.9.3 DirAC Runtime
5.9.4 A Simplistic Binaural Synthesis of Loudspeaker Listening
5.10 Summary
References

6 Higher-Order Directional Audio Coding
Archontis Politis, Ville Pulkki
6.1 Introduction
6.2 Sound Field Model
6.3 Energetic Analysis and Estimation of Parameters
6.3.1 Analysis of Intensity and Diffuseness in the Spherical Harmonic Domain
6.3.2 Higher-Order Energetic Analysis
6.3.3 Sector Profiles
6.4 Synthesis of Target Setup Signals
6.4.1 Loudspeaker Rendering
6.4.2 Binaural Rendering
6.5 Subjective Evaluation
6.6 Conclusions
References

7 Multi-Channel Sound Acquisition Using a Multi-Wave Sound Field Model
Oliver Thiergart, Emanuel Habets
7.1 Introduction
7.2 Parametric Sound Acquisition and Processing
7.2.1 Problem Formulation
7.2.2 Principal Estimation of the Target Signal
7.3 Multi-Wave Sound Field and Signal Model
7.3.1 Direct Sound Model
7.3.2 Diffuse Sound Model
7.3.3 Noise Model
7.4 Direct and Diffuse Signal Estimation
7.4.1 Estimation of the Direct Signal Ys(k,n)
7.4.2 Estimation of the Diffuse Signal Yd(k,n)
7.5 Parameter Estimation
7.5.1 Estimation of the Number of Sources
7.5.2 Direction of Arrival Estimation
7.5.3 Microphone Input PSD Matrix
7.5.4 Noise PSD Estimation
7.5.5 Diffuse Sound PSD Estimation
7.5.6 Signal PSD Estimation in Multi-Wave Scenarios
7.6 Application to Spatial Sound Reproduction
7.6.1 State of the Art
7.6.2 Spatial Sound Reproduction Based on Informed Spatial Filtering
7.7 Summary
References

8 Adaptive Mixing of Excessively Directive and Robust Beamformers for Reproduction of Spatial Sound
Symeon Delikaris-Manias, Juha Vilkamo
8.1 Introduction
8.2 Notation and Signal Model
8.3 Overview of the Method
8.4 Loudspeaker-Based Spatial Sound Reproduction
8.4.1 Estimation of the Target Covariance Matrix Cy
8.4.2 Estimation of the Synthesis Beamforming Signals Ws
8.4.3 Processing the Synthesis Signals (Wsx) to Obtain the Target Covariance Matrix Cy
8.4.4 Spatial Energy Distribution
8.4.5 Listening Tests
8.5 Binaural-Based Spatial Sound Reproduction
8.5.1 Estimation of the Analysis and Synthesis Beamforming Weight Matrices
8.5.2 Diffuse-Field Equalization of HRTFs
8.5.3 Adaptive Mixing and Decorrelation
8.5.4 Subjective Evaluation
8.6 Conclusions
References

9 Source Separation and Reconstruction of Spatial Audio Using Spectrogram Factorization
Joonas Nikunen, Tuomas Virtanen
9.1 Introduction
9.2 Spectrogram Factorization
9.2.1 Mixtures of Sounds
9.2.2 Magnitude Spectrogram Models
9.2.3 Complex-Valued Spectrogram Models
9.2.4 Source Separation by Time-Frequency Filtering
9.3 Array Signal Processing and Spectrogram Factorization
9.3.1 Spaced Microphone Arrays
9.3.2 Model for Spatial Covariance Based on Direction of Arrival
9.3.3 Complex-Valued NMF with the Spatial Covariance Model
9.4 Applications of Spectrogram Factorization in Spatial Audio
9.4.1 Parameterization of Surround Sound: Upmixing by Time-Frequency Filtering
9.4.2 Source Separation Using a Compact Microphone Array
9.4.3 Reconstruction of Binaural Sound Through Source Separation
9.5 Discussion
9.6 Matlab Example
References

Part III Signal-Dependent Spatial Filtering

10 Time-Frequency Domain Spatial Audio Enhancement
Symeon Delikaris-Manias, Pasi Pertilä
10.1 Introduction
10.2 Signal-Independent Enhancement
10.3 Signal-Dependent Enhancement
10.3.1 Adaptive Beamformers
10.3.2 Post-Filters
10.3.3 Post-Filter Types
10.3.4 Estimating Post-Filters with Machine Learning
10.3.5 Post-Filter Design Based on Spatial Parameters
References

11 Cross-Spectrum-Based Post-Filter Utilizing Noisy and Robust Beamformers
Symeon Delikaris-Manias, Ville Pulkki
11.1 Introduction
11.2 Notation and Signal Model
11.2.1 Virtual Microphone Design Utilizing Pressure Microphones
11.3 Estimation of the Cross-Spectrum-Based Post-Filter
11.3.1 Post-Filter Estimation Utilizing Two Static Beamformers
11.3.2 Post-Filter Estimation Utilizing a Static and an Adaptive Beamformer
11.3.3 Smoothing Techniques
11.4 Implementation Examples
11.4.1 Ideal Conditions
11.4.2 Prototype Microphone Arrays
11.5 Conclusions and Further Remarks
11.6 Source Code
References

12 Microphone-Array-Based Speech Enhancement Using Neural Networks
Pasi Pertilä
12.1 Introduction
12.2 Time-Frequency Masks for Speech Enhancement Using Supervised Learning
12.2.1 Beamforming with Post-Filtering
12.2.2 Overview of Mask Prediction
12.2.3 Features for Mask Learning
12.2.4 Target Mask Design
12.3 Artificial Neural Networks
12.3.1 Learning the Weights
12.3.2 Generalization
12.3.3 Deep Neural Networks
12.4 Mask Learning: A Simulated Example
12.4.1 Feature Extraction
12.4.2 Target Mask Design
12.4.3 Neural Network Training
12.4.4 Results
12.5 Mask Learning: A Real-World Example
12.5.1 Brief Description of the Third CHiME Challenge Data
12.5.2 Data Processing and Beamforming
12.5.3 Description of Network Structure, Features, and Targets
12.5.4 Mask Prediction Results and Discussion
12.5.5 Speech Enhancement Results
12.6 Conclusions
12.7 Source Code
12.7.1 Matlab Code for Neural-Network-Based Sawtooth Denoising Example
12.7.2 Matlab Code for Phase Feature Extraction
References

Part IV Applications

13 Upmixing and Beamforming in Professional Audio
Christof Faller
13.1 Introduction
13.2 Stereo-to-Multichannel Upmix Processor
13.2.1 Product Description
13.2.2 Considerations for Professional Audio and Broadcast
13.2.3 Signal Processing
13.3 Digitally Enhanced Shotgun Microphone
13.3.1 Product Description
13.3.2 Concept
13.3.3 Signal Processing
13.3.4 Evaluations and Measurements
13.4 Surround Microphone System Based on Two Microphone Elements
13.4.1 Product Description
13.4.2 Concept
13.5 Summary
References

14 Spatial Sound Scene Synthesis and Manipulation for Virtual Reality and Audio Effects
Ville Pulkki, Archontis Politis, Tapani Pihlajamäki, Mikko-Ville Laitinen
14.1 Introduction
14.2 Parametric Sound Scene Synthesis for Virtual Reality
14.2.1 Overall Structure
14.2.2 Synthesis of Virtual Sources
14.2.3 Synthesis of Room Reverberation
14.2.4 Augmentation of Virtual Reality with Real Spatial Recordings
14.2.5 Higher-Order Processing
14.2.6 Loudspeaker-Signal Bus
14.3 Spatial Manipulation of Sound Scenes
14.3.1 Parametric Directional Transformations
14.3.2 Sweet-Spot Translation and Zooming
14.3.3 Spatial Filtering
14.3.4 Spatial Modulation
14.3.5 Diffuse Field Level Control
14.3.6 Ambience Extraction
14.3.7 Spatialization of Monophonic Signals
14.4 Summary
References

15 Parametric Spatial Audio Techniques in Teleconferencing and Remote Presence
Anastasios Alexandridis, Despoina Pavlidi, Nikolaos Stefanakis, Athanasios Mouchtaris
15.1 Introduction and Motivation
15.2 Background
15.3 Immersive Audio Communication System (ImmACS)
15.3.1 Encoder
15.3.2 Decoder
15.4 Capture and Reproduction of Crowded Acoustic Environments
15.4.1 Sound Source Positioning Based on VBAP
15.4.2 Non-Parametric Approach
15.4.3 Parametric Approach
15.4.4 Example Application
15.5 Conclusions
References

Index

VILLE PULKKI, PHD, is an Associate Professor leading the Communication Acoustics Research Group in the Department of Signal Processing and Acoustics, Aalto University, Finland. He has received distinguished medal awards from the Society of Motion Picture and Television Engineers and from the Audio Engineering Society.

SYMEON DELIKARIS-MANIAS is a postdoctoral researcher affiliated with the Communication Acoustics Research Group in the Department of Signal Processing and Acoustics at Aalto University, Finland.

ARCHONTIS POLITIS, PHD, is a postdoctoral researcher affiliated with the Communication Acoustics Research Group in the Department of Signal Processing and Acoustics at Aalto University and with Tampere University of Technology, both in Finland.

Permalink: https://www.kriso.ee/db/97811192525802e.html
