Learning Automata based Approach to Model Dialogue Strategy in Spoken Dialogue System: A Performance Evaluation G.Kumaravelan Pondicherry University, Karaikal.


Learning Automata based Approach to Model Dialogue Strategy in Spoken Dialogue System: A Performance Evaluation
G. Kumaravelan, Pondicherry University, Karaikal Centre, Karaikal
R. SivaKumar, AVVM Sri Pushpam College, Thanjavur

Dialogue System
- A system that provides an interface between the user and a computer-based application
- Interacts on a turn-by-turn basis
- Dialogue manager controls the flow of the dialogue:
  - gathering information from the user
  - communicating with the external application
  - communicating information back to the user
- Three types of dialogue system (by initiative):
  - finite state- (or graph-) based
  - frame-based
  - agent-based

Spoken Dialogue System Architecture
- Audio → Speech Recognition → Words → Spoken Language Understanding → Semantic representation → Dialogue Manager ↔ Back end
- Dialogue Manager → Concepts → Language Generation → Words → Text-to-Speech Synthesis → Audio

Properties of RL: The Agent-Environment Interaction

Cont…
- Immediate reward vs. long-term reward
- The next state depends only on the current state and action (Markov property)
- Aim: find the policy that yields the highest total reward over T time steps (finite horizon)
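The finite-horizon objective above can be sketched as a plain agent–environment loop; the toy environment and the policies below are hypothetical, chosen only to make the sum-of-rewards objective concrete:

```python
def run_episode(policy, step, s0, T):
    """Accumulate total reward over a finite horizon of T steps."""
    s, total = s0, 0.0
    for t in range(T):
        a = policy(s, t)   # Markov property: the action depends only on the current state (and t)
        s, r = step(s, a)  # environment returns the next state and an immediate reward
        total += r         # long-term reward = sum of immediate rewards
    return total

# Toy deterministic environment: the reward simply equals the chosen action (0 or 1).
step = lambda s, a: (s, float(a))
assert run_episode(lambda s, t: 1, step, 0, 5) == 5.0   # always act: total reward 5 over T=5
assert run_episode(lambda s, t: 0, step, 0, 5) == 0.0   # never act: total reward 0
```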

The Formal Decision Problem: MDP
Given:
- S, a finite state set (with start state s0)
- A, a finite action set
- P(s' | s, a), a table of transition probabilities
- R(s, a, s'), a reward function
- a policy π(s, t) = a
Question: is there a policy that maximises the total reward over the finite horizon T?
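For a finite horizon, the optimal policy can be computed by backward induction (finite-horizon value iteration). A minimal sketch on a hypothetical two-state MDP, with made-up transition probabilities and rewards:

```python
def finite_horizon_vi(S, A, P, R, T):
    """Backward induction: V_t(s) = max_a sum_s' P(s'|s,a) * (R(s,a,s') + V_{t+1}(s'))."""
    V = {s: 0.0 for s in S}           # value at the horizon T is zero
    policy = []
    for t in range(T):                # sweep backwards from t = T-1 down to 0
        Q = {s: {a: sum(p * (R[s][a][s2] + V[s2]) for s2, p in P[s][a].items())
                 for a in A} for s in S}
        policy.insert(0, {s: max(Q[s], key=Q[s].get) for s in S})
        V = {s: max(Q[s].values()) for s in S}
    return V, policy

# Hypothetical MDP: two states, uniform transitions; action 'b' always pays reward 1.
S, A = ['s0', 's1'], ['a', 'b']
P = {s: {a: {'s0': 0.5, 's1': 0.5} for a in A} for s in S}
R = {s: {a: {s2: (1.0 if a == 'b' else 0.0) for s2 in S} for a in A} for s in S}
V, pi = finite_horizon_vi(S, A, P, R, T=3)
assert V['s0'] == 3.0            # picking 'b' for 3 steps yields total reward 3
assert pi[0]['s0'] == 'b'        # the optimal policy chooses 'b' everywhere
```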

Learning Automata Characteristics
Learning Automata (LA) are adaptive decision-making devices that can operate in environments where:
- they have no information about the effect of their actions at the start of operation (unknown environments)
- a given action does not necessarily produce the same response each time it is performed (non-deterministic environments)
A powerful property of LA is that they progressively improve their performance by means of a learning process, combining rapid and accurate convergence with low computational complexity.

Learning Automaton and its Interaction with the Environment
- Set of actions A = {a1, …, an}
- Environment response β ∈ {0, 1}
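This interaction loop can be sketched with the classical linear reward–inaction (L_R-I) scheme of Narendra and Thathachar: on a favourable response (here β = 0, the usual P-model convention) the probability of the chosen action is reinforced; on an unfavourable one the probabilities are left unchanged. The learning rate λ and the toy environment are assumptions for illustration:

```python
import random

def lri_update(p, i, beta, lam=0.1):
    """Linear reward-inaction: reinforce action i only on a favourable response (beta == 0)."""
    if beta == 0:                      # reward: p_i <- p_i + lam * (1 - p_i), others scaled down
        p = [(1 - lam) * pj for pj in p]
        p[i] += lam
    return p                           # inaction: leave the probability vector unchanged on penalty

def choose(p):
    """Sample an action index according to the current probability vector."""
    return random.choices(range(len(p)), weights=p)[0]

# Toy environment: action 0 is always rewarded (beta = 0), action 1 never is.
random.seed(0)
p = [0.5, 0.5]
for _ in range(200):
    i = choose(p)
    p = lri_update(p, i, beta=0 if i == 0 else 1)
assert p[0] > 0.99                     # the automaton converges to the rewarded action
```

Note that the update keeps the probabilities summing to one: scaling every entry by (1 − λ) and adding λ to the chosen action is exactly p_i ← p_i + λ(1 − p_i).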

Methodology
- Follows a frame-based approach that maintains task and attribute histories with respect to the domain in focus.
- The state space is determined by the number of slots in focus.
- The action space is narrowed to "greeting", "request all", "request nth slot", "verify all", "verify nth slot" and "close dialogue".
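A minimal sketch of such a slot-based state and action space (the slot names are hypothetical, chosen to match the travel-planning domain used later):

```python
SLOTS = ['origin', 'destination', 'date']

def dialogue_state(filled):
    """State = which of the slots in focus are filled (2^n states for n slots)."""
    return tuple(slot in filled for slot in SLOTS)

def actions():
    """Action set: greeting, request/verify all, request/verify each slot, close."""
    acts = ['greeting', 'request_all', 'verify_all', 'close_dialogue']
    acts += [f'request_{s}' for s in SLOTS] + [f'verify_{s}' for s in SLOTS]
    return acts

assert dialogue_state({'origin'}) == (True, False, False)
assert 'request_date' in actions() and len(actions()) == 10
```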

Experiments and Results
- Our experiments were based on the travel-planning domain.
- The speech recognition and speech synthesis modules are implemented with the .NET SDK framework.
- The DATE scheme was used as the dialogue act recognition agent.
- A reward in the range of +10 to −5 is assigned for the best and worst action selections, respectively.


Evaluation Methodology
- PARADISE framework
- Task success: calculated with the help of the Attribute Value Matrix (AVM).
- System performance: a weighted measure of task success minus a weighted sum of dialogue costs.
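In PARADISE, task success is the kappa statistic computed from the AVM confusion matrix, and performance combines (normalised) task success and dialogue costs. A sketch of both, assuming the kappa value and costs passed to `performance` have already been z-score normalised and that the weights shown are placeholders (PARADISE estimates them by regression against user satisfaction):

```python
def kappa(m):
    """Cohen's kappa over an AVM confusion matrix m: task success corrected for chance."""
    n = sum(sum(row) for row in m)
    p_a = sum(m[i][i] for i in range(len(m))) / n                      # observed agreement
    p_e = sum((sum(m[i]) / n) * (sum(row[i] for row in m) / n)         # chance agreement
              for i in range(len(m)))
    return (p_a - p_e) / (1 - p_e)

def performance(kappa_n, costs_n, alpha=1.0, weights=None):
    """PARADISE: weighted task success minus a weighted sum of normalised dialogue costs."""
    weights = weights or [0.5] * len(costs_n)                          # placeholder weights
    return alpha * kappa_n - sum(w * c for w, c in zip(weights, costs_n))

# Perfect agreement on a 2-attribute AVM gives kappa = 1.
assert kappa([[10, 0], [0, 10]]) == 1.0
```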

Conclusions
- LA are interesting building blocks for solving different types of RL problems:
  - faster learning
  - knowledge transfer
  - lower computational complexity
- Challenges:
  - different LA update schemes
  - the influence of different state observations (POMDP setting)

References I
- E. Levin, R. Pieraccini and W. Eckert. A stochastic model of human-machine interaction for learning dialogue strategies. IEEE Transactions on Speech and Audio Processing, 8(1), pp. 11–23, 2000.
- M. McTear. Spoken Dialogue Technology: Toward the Conversational User Interface. Springer, 2004.
- K. Narendra and M. A. L. Thathachar. Learning Automata: An Introduction. Prentice-Hall International, 1989.
- A. Nowé and K. Verbeeck. Colonies of learning automata. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 32, 2002.
- T. Paek and R. Pieraccini. Automating spoken dialogue management design using machine learning: an industry perspective. Speech Communication, 50, 2008.
- O. Pietquin and T. Dutoit. A probabilistic framework for dialogue simulation and optimal strategy learning. IEEE Transactions on Speech and Audio Processing, 14(2), pp. 589–599, 2006.

References II
- K. Scheffler and S. Young. Automatic learning of dialogue strategy using dialogue simulation and reinforcement learning. Human Language Technology Conference (HLT), pp. 12–19, 2002.
- S. Singh, D. Litman and M. Kearns. Optimizing dialogue management with reinforcement learning: experiments with the NJFun system. Journal of Artificial Intelligence Research, 16, pp. 105–133, 2002.
- M. A. L. Thathachar and P. S. Sastry. Networks of Learning Automata: Techniques for Online Stochastic Optimization. Kluwer, Norwell, MA, 2004.
- M. Walker and R. Passonneau. DATE: A dialogue act tagging scheme for evaluation of spoken dialogue systems. Proceedings of the Human Language Technology Conference, pp. 1–8, 2001.
- M. Walker, D. Litman and C. Kamm. PARADISE: A framework for evaluating spoken dialogue agents. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pp. 271–280, 1997.

Questions?