
Building Dialogue POMDPs from Expert Dialogues: An end-to-end approach, 1st ed. 2016 [Paperback]

  • Format: Paperback / softback, VII + 119 pages, height x width: 235x155 mm, weight: 2058 g, 22 illustrations (21 in color, 1 black and white)
  • Series: SpringerBriefs in Speech Technology
  • Publication date: 16-Feb-2016
  • Publisher: Springer International Publishing AG
  • ISBN-10: 3319261983
  • ISBN-13: 9783319261980
  • Paperback
  • Price: 48,70 €*
  • * the price is final, i.e. no further discounts apply
  • Regular price: 57,29 €
  • You save 15%
  • Delivery from the publisher takes approximately 2-4 weeks
  • Free shipping

This book discusses the Partially Observable Markov Decision Process (POMDP) framework as applied to dialogue systems. It presents the POMDP as a formal framework that represents uncertainty explicitly while supporting automated policy optimization. The authors propose and implement an end-to-end learning approach for the components of a dialogue POMDP model. Starting from scratch, they learn the states, the transition model, the observation model, and finally the reward model from unannotated, noisy dialogues. Together, these form a significant set of contributions that can inspire substantial further work. This concise manuscript is written in simple language and is full of illustrative examples, figures, and tables.
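At the core of this framework is the Bayesian belief update over the user's hidden intent. The short Python sketch below illustrates the idea on a toy two-intent domain; the state names, probabilities, and model matrices are made-up placeholders, not values from the book.

    import numpy as np

    # Toy dialogue POMDP: the hidden state is the user's intent, and the
    # system only sees noisy ASR keywords. All numbers are illustrative.
    states = ["weather", "traffic"]          # hidden user intents
    actions = ["ask", "confirm"]             # system actions
    observations = ["w_word", "t_word"]      # noisy keyword observations

    # T[a][s, s']: transition model P(s' | s, a)
    T = {a: np.array([[0.9, 0.1],
                      [0.1, 0.9]]) for a in actions}

    # O[a][s', o]: observation model P(o | s', a)
    O = {a: np.array([[0.8, 0.2],
                      [0.3, 0.7]]) for a in actions}

    def belief_update(b, a, o):
        """b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
        predicted = b @ T[a]                       # predict the next state
        unnorm = O[a][:, observations.index(o)] * predicted
        return unnorm / unnorm.sum()               # normalize to a distribution

    b = np.array([0.5, 0.5])                       # uniform prior over intents
    b = belief_update(b, "ask", "w_word")          # a weather keyword is heard
    print(dict(zip(states, b.round(3))))           # {'weather': 0.727, 'traffic': 0.273}

A dialogue policy then maps this belief, rather than a single guessed state, to the next system action, which is what lets the POMDP represent ASR uncertainty explicitly.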

1 Introduction 1(6)
1.1 An End-to-End Approach 5(2)
2 A Few Words on Topic Modeling 7(14)
2.1 Dirichlet Distribution 7(5)
2.1.1 Exponential Distributions 7(1)
2.1.2 Multinomial Distribution 8(1)
2.1.3 Dirichlet Distribution 9(1)
2.1.4 Example on the Dirichlet Distribution 10(2)
2.2 Latent Dirichlet Allocation 12(4)
2.3 Hidden Markov Models 16(5)
3 Sequential Decision Making in Spoken Dialog Management 21(24)
3.1 Sequential Decision Making 21(14)
3.1.1 Markov Decision Processes 23(1)
3.1.2 Partially Observable Markov Decision Processes 24(3)
3.1.3 Reinforcement Learning 27(1)
3.1.4 Solving MDPs/POMDPs 27(8)
3.2 Spoken Dialog Management 35(10)
3.2.1 MDP-Based Dialog Policy Learning 37(1)
3.2.2 POMDP-Based Dialog Policy Learning 38(2)
3.2.3 User Modeling in Dialog POMDPs 40(5)
4 Learning the Dialog POMDP Model Components 45(22)
4.1 Introduction 45(1)
4.2 Learning Intents as States 46(6)
4.2.1 Hidden Topic Markov Model for Dialogs 46(4)
4.2.2 Learning Intents from SACTI-1 Dialogs 50(2)
4.3 Learning the Transition Model 52(2)
4.4 Learning Observations and Observation Model 54(3)
4.4.1 Keyword Observation Model 55(1)
4.4.2 Intent Observation Model 56(1)
4.5 Example on SACTI Dialogs 57(7)
4.5.1 HTMM Evaluation 60(2)
4.5.2 Learned POMDP Evaluation 62(2)
4.6 Conclusions 64(3)
5 Learning the Reward Function 67(22)
5.1 Introduction 67(2)
5.2 IRL in the MDP Framework 69(6)
5.3 IRL in the POMDP Framework 75(9)
5.3.1 POMDP-IRL-BT 75(5)
5.3.2 PB-POMDP-IRL 80(3)
5.3.3 PB-POMDP-IRL Evaluation 83(1)
5.4 Related Work 84(2)
5.5 POMDP-IRL-MC 86(1)
5.6 POMDP-IRL-BT and PB-POMDP-IRL Performance 87(1)
5.7 Conclusions 88(1)
6 Application on Healthcare Dialog Management 89(20)
6.1 Introduction 89(2)
6.2 Dialog POMDP Model Learning for SmartWheeler 91(6)
6.2.1 Observation Model Learning 94(2)
6.2.2 Comparison of the Intent POMDP to the Keyword POMDP 96(1)
6.3 Reward Function Learning for SmartWheeler 97(9)
6.3.1 Choice of Features 98(1)
6.3.2 MDP-IRL Learned Rewards 99(2)
6.3.3 POMDP-IRL-BT Evaluation 101(1)
6.3.4 Comparison of POMDP-IRL-BT to POMDP-IRL-MC 102(4)
6.4 Conclusions 106(3)
7 Conclusions and Future Work 109(4)
7.1 Summary 109(2)
7.2 Future Work 111(2)
References 113
Hamidreza Chinaei is a postdoctoral fellow at the Department of Computer Science of the University of Toronto, under the supervision of Dr. Frank Rudzicz, through an NSERC Engage Fund with IBM Canada. Dr. Chinaei received his PhD in Computer Science in 2013 from Laval University, on the application of machine learning to speech and natural language processing tasks, and his MMath in Computer Science from the University of Waterloo, on semantic query optimization. He received the Industrial Track Student Scholarship and Award at the 2012 Canadian AI Conference and the Best Student Paper Award at the International Conference on Agents and Artificial Intelligence in 2009.

Brahim Chaib-draa received a Diploma in Computer Engineering from the École Supérieure d'Électricité (SUPELEC), Paris, France, in 1978 and a Ph.D. degree in Computer Science from the Université du Hainaut-Cambrésis, Valenciennes, France, in 1990. In 1990 he joined the Department of Computer Science and Software Engineering at Laval University, Quebec, Canada, where he is a Professor and Group Leader of the Decision for Agents and Multi-Agent Systems (DAMAS) Group. His research interests include agent and multiagent computing, machine learning, and complex decision making. He is the author of several technical publications. Dr. Chaib-draa is a member of the ACM and AAAI and a senior member of the IEEE Computer Society.