1 Dr Alexiei Dingli Introduction to Web Science Reusing knowledge
2 Acquire Model Reuse Retrieve Publish Maintain Six challenges of the Knowledge Life Cycle
3 Three reusable types of objects –Ontologies –Problem Solving Methods –Knowledge Bases Plus we can also use additional sources (WWW) Reusing knowledge
4 Locating the knowledge to be reused is difficult Distributed agents may be unaware that the knowledge they need is available (this is the challenge of knowledge retrieval) Knowledge may simply be in the wrong form for the task Problems with reuse
5 Question answering Dialogue systems Two particular reuse tasks
6 The main aim of QA is to present the user with a short answer to a question rather than a list of possibly relevant documents. As it becomes more and more difficult to find answers on the WWW using standard search engines, question answering technology will become increasingly important. What is Question Answering?
7 Clearly there are many different types of questions: –When was Mozart born? Question requires a single fact as an answer. Answer may be found verbatim in text, e.g. “Mozart was born in 1756”. –How did Socrates die? Finding an answer may require reasoning. In this example, “die” has to be linked with drinking poisoned wine. Question Types (1)
8 –How do I assemble a bike? The full answer may require fusing information from many different sources. The complexity can range from simple lists to script-based answers. –Is the Earth flat? Requires a simple yes/no answer. Question Types (2)
9 The biggest independent evaluations of question answering systems have been carried out at TREC (Text Retrieval Conference) –Five hundred factoid questions are provided and the groups taking part have a week in which to process the questions and return one answer per question. –No changes are allowed to your system between the time you receive the questions and the time you submit the answers. Evaluating QA Systems
10 A Generic QA Framework A search engine is used to find the n most relevant documents in the document collection. These documents are then processed with respect to the question to produce a set of answers, which are passed back to the user. Most of the differences between question answering systems are centred around the document processing stage.
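The two stages of the framework can be sketched as follows. This is only an illustration under stated assumptions: the word-overlap ranking stands in for a real search engine, and the names `tokenize`, `retrieve`, `answer` and `find_years` are hypothetical, not part of any system described in these slides.

```python
import re

def tokenize(text):
    """Lower-case a text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, collection, n=2):
    """Stage 1: rank documents by word overlap with the question, keep the top n."""
    q = tokenize(question)
    ranked = sorted(collection, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:n]

def answer(question, collection, extract, n=2):
    """Stage 2: run a caller-supplied extractor over the retrieved documents."""
    answers = []
    for doc in retrieve(question, collection, n):
        answers.extend(extract(doc))
    return answers

def find_years(doc):
    """Toy document processor: treat every four-digit number as a candidate year."""
    return re.findall(r"\b\d{4}\b", doc)

collection = [
    "Mozart was born in Salzburg in 1756.",
    "Socrates was a Greek philosopher.",
    "The Earth is round.",
]
print(answer("When was Mozart born?", collection, find_years))
```

The extractor is passed in as a parameter because, as the slide notes, the document processing stage is where systems differ most.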
11 The answers to the majority of factoid questions are easily recognised named entities, such as countries, cities, dates, people's names, etc. The relatively simple techniques of gazetteer lists and named entity recognisers allow us to locate these entities within the relevant documents; the most frequent one can then be returned as the answer. This leaves just one issue that needs solving: how do we know, for a specific question, what the type of the answer should be? A Simplified Approach
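A minimal sketch of the gazetteer-plus-frequency idea, assuming the relevant documents have already been retrieved. The gazetteer contents and the function names are hypothetical and deliberately tiny; a real named entity recogniser would do much more than exact token lookup.

```python
from collections import Counter
import re

# Hypothetical gazetteer: entity type -> known names.
GAZETTEER = {
    "location": {"Austria", "Salzburg", "Vienna"},
    "person": {"Mozart", "Haydn"},
}

def find_entities(text, entity_type):
    """Return every gazetteer entry of the given type that occurs in the text."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [t for t in tokens if t in GAZETTEER[entity_type]]

def most_frequent_entity(documents, entity_type):
    """Pick the entity of the expected type seen most often across the documents."""
    counts = Counter()
    for doc in documents:
        counts.update(find_entities(doc, entity_type))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

docs = [
    "Mozart was born in Salzburg.",
    "Salzburg is the birthplace of Mozart, who later moved to Vienna.",
]
print(most_frequent_entity(docs, "location"))
```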
12 The simplest way to determine the expected type of an answer is to look at the words which make up the question: who – suggests a person; when – suggests a date; where – suggests a location. A Simplified Approach (1)
13 Clearly this division does not account for every question, but it is easy to add more complex rules: country – suggests a location; how much – suggests an amount of money; author – suggests a person; birthday – suggests a date; college – suggests an organization. These rules can be easily extended as we think of more questions to ask. A Simplified Approach (2)
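The keyword rules from these two slides can be written down almost directly. The sketch below assumes whole-word matching and an ordering in which the more specific cues are tried before the generic question words; both choices, and the names `RULES` and `expected_answer_type`, are illustrative assumptions.

```python
import re

# Cue -> expected answer type, taken from the rules on the slides.
# More specific cues come first so they win over "who", "when", "where".
RULES = [
    ("how much", "amount of money"),
    ("birthday", "date"),
    ("country", "location"),
    ("college", "organization"),
    ("author", "person"),
    ("who", "person"),
    ("when", "date"),
    ("where", "location"),
]

def expected_answer_type(question):
    """Return the answer type of the first cue found as a whole word, else None."""
    q = question.lower()
    for cue, answer_type in RULES:
        if re.search(r"\b" + re.escape(cue) + r"\b", q):
            return answer_type
    return None

for q in ["When was Mozart born?", "Which country is Vienna in?", "How did Socrates die?"]:
    print(q, "->", expected_answer_type(q))
```

A question such as "How did Socrates die?" matches no rule and returns None, which is the gap the later "Problems" slides point at.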
14 The most frequently occurring instance of the right type might not be the correct answer. –For example, if you are asking when someone was born, it may be that their death was more notable and hence will appear more often (e.g. John F Kennedy’s assassination). There are many questions for which correct answers are not named entities: –How did Ayrton Senna die? – in a car crash Problems (1)
15 The gazetteer lists and named entity recognisers are unlikely to cover every type of named entity that may be asked about: –Even those types that are covered may well not be complete. –It is of course relatively easy to build new lists, e.g. Birthstones. Problems (2)
16 Amber, Precious, Diamond, Asia, Summer, Holly. Are these people's names? Does a gazetteer of people's names contain all the names?
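The ambiguity can be made concrete with a lookup across several gazetteers. The lists below are hypothetical and only illustrate that a single word may legitimately appear in more than one of them, so a lookup alone cannot decide its type.

```python
# Hypothetical gazetteers; each word on the slide is also a person's name.
GAZETTEERS = {
    "person": {"Amber", "Precious", "Diamond", "Asia", "Summer", "Holly"},
    "gemstone": {"Amber", "Diamond"},
    "location": {"Asia"},
    "season": {"Summer"},
    "plant": {"Holly"},
}

def candidate_types(word):
    """Return, in alphabetical order, every entity type whose gazetteer lists the word."""
    return sorted(t for t, names in GAZETTEERS.items() if word in names)

for word in ["Amber", "Asia", "Precious", "Malta"]:
    print(word, candidate_types(word))
```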
17 A sequence of utterances Exchange of information among multiple dialogue participants Stays coherent over time Driven by a certain goal –finding the most suitable restaurant in a foreign city, –booking the cheapest flight to a given city, –controlling the state of the devices in a home, –or the goal might also be the interaction itself (chatting) Dialogue (1)
18 The most natural means of communication for humans Perceived as very expressive, efficient and robust However, dialogue is a very complex protocol –participants follow certain conventions or protocols –humans usually use their extensive knowledge and reasoning capabilities to understand the conversational partner –dialogue utterances are often imperfect: ungrammatical or elliptical Dialogue (2)
19 People often utter partial phrases to avoid repetition –A: At what time is “Titanic” playing? –B: 8pm –A: And “The 5th element”? It is necessary to keep track of the conversation to complete such phrases Ellipsis
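One way to keep track of the conversation is to remember the last complete question and reuse it when a fragment arrives. The sketch below is a toy under strong assumptions: titles appear in straight double quotes, a fragment starts with "And", and the class name `EllipsisResolver` is hypothetical.

```python
import re

class EllipsisResolver:
    """Remembers the last full question so that a fragment can be completed."""

    def __init__(self):
        self.last_question = None  # most recent complete question
        self.last_title = None     # quoted title mentioned in that question

    def resolve(self, utterance):
        titles = re.findall(r'"([^"]+)"', utterance)
        is_fragment = utterance.lower().startswith("and ") and len(titles) == 1
        if is_fragment:
            if self.last_question is None:
                return utterance  # nothing to complete the fragment with
            # Reuse the previous question, swapping in the new title.
            utterance = self.last_question.replace(self.last_title, titles[0])
        if titles:
            self.last_question = utterance
            self.last_title = titles[-1]
        return utterance

resolver = EllipsisResolver()
print(resolver.resolve('At what time is "Titanic" playing?'))
print(resolver.resolve('And "The 5th element"?'))
```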
20 Some words can only be interpreted in context: –Previous context (anaphora) “The monkey took the banana and ate it” –Future context (cataphora) “Give me that. The book by the lamp.” –Temporal/spatial “The man behind me will be dead tomorrow.” (Who is the man? When did/will he die?) Deixis
21 The meaning of a discourse may be far from literal. –B: I can’t reach him. –A: There is the telephone. –B: I am not in my office. –A: Okay. Undertones & implications are often employed for effect or efficiency Indirect Meaning
22 People seem to know very well when they can take their turn –There is little overlap (5%) –Gaps are often a few 1/10ths of a second –Appears fluid, but not obvious why A computational model of overlap does not exist –this causes problems for dialogue systems Turn Taking
23 Phrases like “a-ha”, “yes”, “hmm” or “eh” are often uttered to fill pauses in the conversation or to indicate attention or reflection. The challenge here is to recognize when they should be understood as a request for turn taking and when they should be ignored. Conversational fillers
24 Flight and train timetable information and reservation Smart homes Automated directory enquiries –Yellow pages enquiries –Weather information Most common dialogue domains
25 Components of a Dialogue System
26 Transforms speech to text Two basic types –Grammar-based ASR The set of accepted phrases defined by regular/context-free grammars (i.e. language model in the form of a grammar) Usually speaker independent –Dictation machine Recognizes “any utterance” N-gram language model Often speaker dependent Automatic Speech Recognition
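The difference between the two kinds of language model can be sketched on text alone, leaving the acoustic side out entirely. The grammar, the toy corpus and all function names below are hypothetical; the bigram model uses unsmoothed maximum-likelihood estimates, whereas a practical dictation system would smooth them so that unseen word pairs do not get probability zero.

```python
import re
from collections import Counter

# Grammar-based: a regular grammar written as a regular expression.
# Only phrases of exactly this shape are accepted.
GRAMMAR = re.compile(r"^(book|cancel) a (flight|train) to (london|paris)$")

def grammar_accepts(phrase):
    return GRAMMAR.match(phrase) is not None

# N-gram: bigram counts collected from a toy training corpus.
def train_bigrams(sentences):
    bigrams, contexts = Counter(), Counter()
    for s in sentences:
        words = ["<s>"] + s.split()
        contexts.update(words[:-1])            # each word used as a left context
        bigrams.update(zip(words, words[1:]))  # each adjacent word pair
    return bigrams, contexts

def bigram_probability(sentence, bigrams, contexts):
    """Product of P(word | previous word); 0.0 if any pair was never seen."""
    words = ["<s>"] + sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        if contexts[prev] == 0:
            return 0.0
        p *= bigrams[(prev, cur)] / contexts[prev]
    return p

corpus = [
    "book a flight to paris",
    "book a train to london",
    "cancel a flight to london",
]
bigrams, contexts = train_bigrams(corpus)
print(grammar_accepts("please book a flight to paris"))
print(bigram_probability("book a flight to london", bigrams, contexts))
```

The grammar rejects anything outside its fixed set of phrases, while the bigram model scores any word sequence, including ones that never occurred as a whole in the training corpus.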
27 Analyzes textual utterance and returns its formal semantic representation –Logical formula –Named entities –etc Natural Language Understanding
28 Coordinates activity of all components Maintains representation of the current state of the dialogue Communicates with external applications Decides about the next dialogue step Dialogue Manager
29 Finite-state –dialogue flow determined by a finite-state automaton Frame-based –form filling Plan (task) based –a dynamic plan is constructed to reach the dialogue goal … in practice, you often find extended versions or combinations of the above-mentioned approaches! Three types of DM
30 Finite State Automata
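A finite-state dialogue manager can be sketched as a table of states, each with a system prompt and a successor. The flight-booking flow, the state names and `run_dialogue` are hypothetical; the point is only that the order of questions is fixed by the automaton.

```python
# Hypothetical flight-booking flow: state -> (system prompt, next state).
STATES = {
    "ask_origin": ("Where are you flying from?", "ask_destination"),
    "ask_destination": ("Where are you flying to?", "ask_date"),
    "ask_date": ("On what date?", "done"),
}

def run_dialogue(user_answers):
    """Walk the automaton from the start state, consuming one answer per state."""
    state, transcript, collected = "ask_origin", [], {}
    answers = iter(user_answers)
    while state != "done":
        prompt, next_state = STATES[state]
        transcript.append(prompt)
        collected[state] = next(answers)
        state = next_state
    return transcript, collected

transcript, collected = run_dialogue(["Malta", "London", "Monday"])
print(transcript)
print(collected)
```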
31 Frame Based
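Frame-based management (form filling) can be sketched as a set of slots and a rule that asks for the first one still empty. The slot names, prompts and functions are hypothetical, and the slot values are assumed to come from a natural language understanding component that is not shown.

```python
# Hypothetical frame for a flight booking.
SLOTS = ["origin", "destination", "date"]
PROMPTS = {
    "origin": "Where are you flying from?",
    "destination": "Where are you flying to?",
    "date": "On what date?",
}

def update_frame(frame, parsed):
    """Merge the slot values extracted from one user utterance into the frame."""
    for slot, value in parsed.items():
        if slot in SLOTS:
            frame[slot] = value
    return frame

def next_prompt(frame):
    """Ask for the first slot that is still empty; None means the frame is full."""
    for slot in SLOTS:
        if frame.get(slot) is None:
            return PROMPTS[slot]
    return None

frame = {}
# The user volunteers two slots at once: "I want to go to London on Monday".
update_frame(frame, {"destination": "London", "date": "Monday"})
print(next_prompt(frame))
```

Unlike a fixed automaton, the user may fill several slots in one utterance and in any order; the system only asks for what is missing.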
32 Take a problem-solving approach –There are goals to be reached –Plans are made to reach those goals –The goals and plans of the other participants must be iteratively inferred or predicted Potential for handling complicated dialogues –suffers from today’s technological limitations –in more complex cases the planning problem can become computationally intractable Examples: Bathroom consultant Plan Based
33 Produces a textual utterance (so called surface realization) from an internal (formal) representation of the answer The surface realization can include formatting information –Speaking style, pauses –Background sounds Natural Language Generation
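Template filling is one simple way to get from an internal representation to a surface realization. The dictionary-based representation, the template texts and `realize` below are assumptions made for illustration, and formatting information such as pauses or speaking style is left out.

```python
# Hypothetical templates: dialogue act type -> sentence pattern.
TEMPLATES = {
    "inform_time": "{title} is playing at {time}.",
    "not_found": "Sorry, I could not find {title}.",
}

def realize(act):
    """Turn an internal representation (type plus slots) into a text utterance."""
    template = TEMPLATES[act["type"]]
    return template.format(**act["slots"])

print(realize({"type": "inform_time", "slots": {"title": "Titanic", "time": "8pm"}}))
```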
34 Transforms the surface realization into an acoustic representation (sound signal) Text-To-Speech
35 Commercial systems: –small vocabulary (~100 words) –closed domain –system initiative Research systems: –larger (but still small) vocabulary (~10000 words) –closed domain –(limited) mixed initiative Typical parameters
36 System-initiative –system always has control, user only responds to system questions User-initiative: –user always has control, system passively answers user questions Mixed-initiative: –control switches between system and user using fixed rules Variable-initiative: –control switches between system and user dynamically based on participant roles, dialogue history, etc. Different Initiatives
37 Several possible input/output modalities to communicate with dialogue systems –speech, text, pointing, graphics, gestures, face configurations, body positions, emotions, etc. No single “most convenient” modality (different modalities have different advantages) –entering day of week: click on a calendar –entering Zip code: use keyboard –performing commands: speech –complex queries: express them as typed natural language Several modalities are useful –when one modality is not applicable, e.g. eyes or hands are busy, silent environment –or when one is difficult to use, e.g. small devices with limited keyboard and small screen Multi Modal Dialogue Systems
38 Comic Companions Case Study
39 The Comic Avatar
40 Wizard of Oz
41 Putting it together
42 The Companions Architecture
43 The Companions Robot
44 The Companions Interface 1
45 The Companions Interface 2
46 Questions?