List of Contributors | xiii
About the Editors | xix

Multimodal behavior analysis in the wild: An introduction | 1 | (8)
0.1 Analyzing human behavior in the wild from multimodal data | 1 | (2)
3 | (3)
0.3 Summary of important points | 6 | (1)
7 | (2)

Chapter 1 Multimodal open-domain conversations with robotic platforms | 9 | (18)
9 | (5)
1.1.1 Constructive Dialog Model | 11 | (3)
14 | (4)
1.2.1 Topic shifts and topic trees | 14 | (2)
1.2.2 Dialogs using Wikipedia | 16 | (2)
18 | (3)
1.3.1 Multimodal WikiTalk for robots | 19 | (1)
1.3.2 Multimodal topic modeling | 20 | (1)
21 | (2)
1.4.1 Dialogs using domain ontologies | 21 | (1)
1.4.2 IoT and an integrated robot architecture | 22 | (1)
23 | (1)
24 | (3)

Chapter 2 Audio-motor integration for robot audition | 27 | (26)
27 | (2)
2.2 Audio-motor integration in psychophysics and robotics | 29 | (3)
2.3 Single-microphone sound localization using head movements | 32 | (5)
2.3.1 HRTF model and dynamic cues | 32 | (2)
2.3.2 Learning-based sound localization | 34 | (2)
36 | (1)
2.4 Ego-noise reduction using proprioceptors | 37 | (8)
2.4.1 Ego-noise: challenges and opportunities | 37 | (1)
2.4.2 Proprioceptor-guided dictionary learning | 37 | (2)
2.4.3 Phase-optimized dictionary learning | 39 | (2)
2.4.4 Audio-motor integration via support vector machines | 41 | (3)
44 | (1)
2.5 Conclusion and perspectives | 45 | (1)
46 | (7)

Chapter 3 Audio source separation into the wild | 53 | (26)
53 | (1)
3.2 Multichannel audio source separation | 54 | (4)
3.3 Making MASS go from labs into the wild | 58 | (10)
3.3.1 Moving sources and sensors | 58 | (3)
3.3.2 Varying number of (active) sources | 61 | (2)
3.3.3 Spatially diffuse sources and long mixing filters | 63 | (4)
3.3.4 Ad hoc microphone arrays | 67 | (1)
3.4 Conclusions and perspectives | 68 | (2)
70 | (9)

Chapter 4 Designing audio-visual tools to support multisensory disabilities | 79 | (24)
79 | (3)
82 | (3)
85 | (7)
4.4 Visual recognition module | 88 | (1)
4.4.1 Object-instance recognition | 88 | (1)
4.4.2 Experimental assessment | 89 | (3)
4.5 Complementary hearing aid module | 92 | (3)
4.5.1 Measurement of Glassense beam pattern | 92 | (1)
4.5.2 Analysis of measured beam pattern | 93 | (2)
4.6 Assessing usability with impaired users | 95 | (3)
4.6.1 Glassense field tests with visually impaired | 96 | (1)
4.6.2 Glassense field tests with binaural hearing loss | 96 | (2)
98 | (1)
99 | (4)

Chapter 5 Audio-visual learning for body-worn cameras | 103 | (18)
103 | (2)
5.2 Multi-modal classification | 105 | (2)
5.3 Cross-modal adaptation | 107 | (3)
5.4 Audio-visual reidentification | 110 | (1)
5.5 Reidentification dataset | 111 | (1)
5.6 Reidentification results | 112 | (4)
116 | (1)
116 | (5)

Chapter 6 Activity recognition from visual lifelogs: State of the art and future challenges | 121 | (14)
121 | (2)
6.2 Activity recognition from egocentric images | 123 | (2)
6.3 Activity recognition from egocentric photo-streams | 125 | (2)
127 | (4)
127 | (1)
128 | (1)
6.4.3 Results and discussion | 129 | (2)
131 | (1)
132 | (3)

Chapter 7 Lifelog retrieval for memory stimulation of people with memory impairment | 135 | (24)
135 | (3)
138 | (3)
7.3 Retrieval based on key-frame semantic selection | 141 | (8)
7.3.1 Summarization of autobiographical episodes | 143 | (1)
7.3.2 Semantic key-frame selection | 144 | (2)
7.3.3 Egocentric image retrieval based on CNNs and inverted index search | 146 | (3)
149 | (5)
149 | (1)
150 | (1)
7.4.3 Evaluation measures | 151 | (1)
151 | (2)
153 | (1)
154 | (1)
154 | (1)
155 | (4)

Chapter 8 Integrating signals for reasoning about visitors' behavior in cultural heritage | 159 | (12)
159 | (2)
8.2 Using technology for reasoning about visitors' behavior | 161 | (5)
166 | (1)
167 | (1)
168 | (3)

Chapter 9 Wearable systems for improving tourist experience | 171 | (28)
171 | (1)
172 | (4)
9.3 Behavior analysis for smart guides | 176 | (1)
176 | (10)
186 | (8)
194 | (1)
195 | (4)

Chapter 10 Recognizing social relationships from an egocentric vision perspective | 199 | (26)
199 | (3)
202 | (2)
10.2.1 Head pose estimation | 202 | (1)
10.2.2 Social interactions | 203 | (1)
10.3 Understanding people interactions | 204 | (6)
10.3.1 Face detection and tracking | 205 | (1)
10.3.2 Head pose estimation | 205 | (3)
10.3.3 3D people localization | 208 | (2)
10.4 Social group detection | 210 | (2)
10.4.1 Correlation clustering via structural SVM | 210 | (2)
10.5 Social relevance estimation | 212 | (1)
10.6 Experimental results | 213 | (9)
10.6.1 Head pose estimation | 214 | (1)
10.6.2 Distance estimation | 215 | (1)
216 | (4)
220 | (2)
222 | (1)
223 | (2)

Chapter 11 Complex conversational scene analysis using wearable sensors | 225 | (22)
225 | (2)
11.2 Defining 'in the wild' and ecological validity | 227 | (1)
11.3 Ecological validity vs. experimental control | 228 | (1)
11.4 Ecological validity vs. robust automated perception | 229 | (1)
11.5 Thin vs. thick slices of analysis | 230 | (1)
11.6 Collecting data of social behavior | 230 | (4)
11.6.1 Practical concerns when collecting data during social events | 231 | (3)
11.7 Analyzing social actions with a single body worn accelerometer | 234 | (7)
11.7.1 Feature extraction and classification | 235 | (1)
11.7.2 Performance vs. sample size | 236 | (2)
11.7.3 Transductive parameter transfer (TPT) for personalized models | 238 | (3)
241 | (1)
241 | (1)
242 | (5)

Chapter 12 Detecting conversational groups in images using clustering games | 247 | (22)
247 | (3)
250 | (1)
251 | (4)
12.3.1 Notations and definitions | 251 | (2)
253 | (2)
12.4 Conversational groups as equilibria of clustering games | 255 | (3)
12.4.1 Frustum of attention | 255 | (2)
12.4.2 Quantifying pairwise interactions | 257 | (1)
258 | (1)
12.5 Finding ESS-clusters using game dynamics | 258 | (3)
12.6 Experiments and results | 261 | (4)
261 | (1)
12.6.2 Evaluation metrics and parameter exploration | 262 | (1)
263 | (2)
265 | (1)
265 | (4)

Chapter 13 We are less free than how we think: Regular patterns in nonverbal communication | 269 | (20)
269 | (2)
13.2 On spotting cues: how many and when | 271 | (5)
272 | (1)
273 | (2)
275 | (1)
13.3 On following turns: who talks with whom | 276 | (3)
277 | (1)
278 | (1)
278 | (1)
13.4 On speech dancing: who imitates whom | 279 | (7)
279 | (3)
282 | (4)
286 | (1)
287 | (2)

Chapter 14 Crowd behavior analysis from fixed and moving cameras | 289 | (34)
289 | (4)
14.2 Microscopic and macroscopic crowd modeling | 293 | (2)
14.3 Motion information for crowd representation from fixed cameras | 295 | (3)
14.3.1 Pre-processing and selection of areas of interest | 295 | (1)
14.3.2 Motion-based crowd behavior analysis | 296 | (2)
14.4 Crowd behavior and density analysis | 298 | (3)
14.4.1 Person detection and tracking in crowded scenes | 299 | (1)
14.4.2 Low level features for crowd density estimation | 300 | (1)
14.5 CNN-based crowd analysis methods for surveillance and anomaly detection | 301 | (6)
14.6 Crowd analysis using moving sensors | 307 | (3)
14.7 Metrics and datasets | 310 | (4)
14.7.1 Metrics for performance evaluation | 310 | (2)
14.7.2 Datasets for crowd behavior analysis | 312 | (2)
314 | (1)
315 | (8)

Chapter 15 Towards multi-modality invariance: A study in visual representation | 323 | (26)
15.1 Introduction and related work | 323 | (3)
15.2 Variances in visual representation | 326 | (2)
15.3 Reversal invariance in BoVW | 328 | (9)
15.3.1 Reversal symmetry and Max-SIFT | 329 | (1)
15.3.2 RIDE: generalized reversal invariance | 330 | (2)
15.3.3 Application to image classification | 332 | (1)
15.4 Reversal invariance in CNN | 337 | (7)
15.4.1 Reversal-invariant convolution (RI-Conv) | 337 | (1)
15.4.2 Relationship to data augmentation | 338 | (2)
340 | (1)
15.4.4 ILSVRC2012 classification experiments | 341 | (2)
343 | (1)
344 | (1)
344 | (5)

Chapter 16 Sentiment concept embedding for visual affect recognition | 349 | (20)
349 | (3)
16.1.1 Embeddings for image classification | 350 | (1)
16.1.2 Affective computing | 351 | (1)
16.2 Visual sentiment ontology | 352 | (1)
16.3 Building output embeddings for ANPs | 353 | (4)
16.3.1 Combining adjectives and nouns | 354 | (2)
16.3.2 Loss functions for the embeddings | 356 | (1)
16.4 Experimental results | 357 | (5)
16.4.1 Adjective noun pair detection | 358 | (3)
16.4.2 Zero-shot concept detection | 361 | (1)
16.5 Visual affect recognition | 362 | (3)
16.5.1 Visual emotion prediction | 363 | (1)
16.5.2 Visual sentiment prediction | 364 | (1)
16.6 Conclusions and future work | 365 | (1)
366 | (3)

Chapter 17 Video-based emotion recognition in the wild | 369 | (18)
369 | (1)
370 | (4)
374 | (2)
17.4 Experimental results | 376 | (3)
376 | (2)
17.4.2 ChaLearn Challenges | 378 | (1)
17.5 Conclusions and discussion | 379 | (3)
382 | (1)
382 | (5)

Chapter 18 Real-world automatic continuous affect recognition from audiovisual signals | 387 | (20)
387 | (2)
18.2 Real world vs laboratory settings | 389 | (1)
18.3 Audio and video affect cues and theories of emotion | 389 | (3)
389 | (1)
390 | (1)
18.3.3 Quantifying affect | 391 | (1)
392 | (5)
18.4.1 Multimodal fusion techniques | 392 | (1)
393 | (1)
394 | (2)
18.4.4 Affect recognition competitions | 396 | (1)
18.5 Audiovisual affect recognition: a representative end-to-end learning system | 397 | (5)
398 | (2)
18.5.2 Experiments & results | 400 | (2)
402 | (1)
403 | (4)

Chapter 19 Affective facial computing: Generalizability across domains | 407 | (36)
407 | (2)
409 | (1)
19.3 Approaches to annotation | 410 | (1)
19.4 Reliability and performance | 411 | (2)
19.5 Factors influencing performance | 413 | (2)
19.6 Systematic review of studies of cross-domain generalizability | 415 | (14)
416 | (1)
416 | (3)
19.6.3 Cross-domain generalizability | 419 | (8)
19.6.4 Studies using deep- vs. shallow learning | 427 | (1)
428 | (1)
429 | (4)
433 | (1)
434 | (1)
434 | (9)

Chapter 20 Automatic recognition of self-reported and perceived emotions | 443 | (28)
443 | (2)
20.2 Emotion production and perception | 445 | (3)
20.2.1 Descriptions of emotion | 445 | (1)
20.2.2 Brunswik's functional lens model | 446 | (2)
448 | (1)
20.3 Observations from perception experiments | 448 | (2)
20.4 Collection and annotation of labeled emotion data | 450 | (3)
20.4.1 Emotion-elicitation methods | 450 | (2)
20.4.2 Data annotation tools | 452 | (1)
453 | (4)
453 | (2)
20.5.2 Audio, visual, physiological, and multi-modal datasets | 455 | (2)
20.6 Recognition of self-reported and perceived emotion | 457 | (3)
20.7 Challenges and prospects | 460 | (3)
20.8 Concluding remarks | 463 | (1)
463 | (8)

Index | 471