Knowledge Representation & Acquisition for Large-Scale Semantic Memory. Julian Szymański, Dept. of Electronic, Telecommunication & Informatics, Gdańsk University of Technology.

Knowledge Representation & Acquisition for Large-Scale Semantic Memory Julian Szymański Dept. of Electronic, Telecommunication & Informatics, Gdańsk University of Technology, Poland, Włodzisław Duch Department of Informatics, Nicolaus Copernicus University, Toruń, Poland WCCI 2008

Plan & main points
Goal: reaching human-level competence in all aspects of NLP. Well... step by step.
– Representation of semantic concepts is necessary for the understanding of natural language by cognitive systems.
– Word games are an opportunity for semantic knowledge acquisition that may be used to construct semantic memory.
– A task-dependent architecture of the knowledge base, inspired by psycholinguistic theories of the cognition process, is introduced.
– A semantic search algorithm for simplified concept vector representations.
– A 20 questions game based on semantic memory has been implemented: a good test of the linguistic competence of the system.
– A web portal with a Haptek-based talking-head interface facilitates acquisition of new knowledge while playing the game and engaging in dialogs with users.

Humanized interface (architecture diagram): semantic memory store; applications (search, 20 questions game); query interface; parser with part-of-speech tagger and phrase extractor; on-line dictionaries; active search and dialogues with users; manual verification.

Ambitious approaches…
CYC, started by Douglas Lenat and developed by CyCorp, with 2.5 million assertions linking concepts and using thousands of micro-theories (2004). Cyc-NL is still a "potential application": knowledge representation in frames is quite complicated and thus difficult to use.
Open Mind Common Sense Project (MIT): a WWW collaboration with over 14,000 authors, who contributed 710,000 sentences; used to generate ConceptNet, a very large semantic network.
Other such projects: HowNet (Chinese Academy of Sciences), FrameNet (Berkeley), various large-scale ontologies.
The focus of these projects is to understand all relations in text/dialogue. NLP is hard and messy! Many people have lost hope that without deep embodiment we shall create good NLP systems. Go the brain way! How does the brain do it?

Semantic Memory Models
Endel Tulving, "Episodic and Semantic Memory" (1972). Semantic memory refers to the memory of meanings and understandings. It stores concept-based, generic, context-free knowledge: a permanent container for general knowledge (facts, ideas, words, etc.).
Semantic network model: Collins & Loftus, 1975. Hierarchical model: Collins & Quillian, 1969.

Words in the brain
Psycholinguistic experiments show that most likely categorical, phonological representations are used, not the raw acoustic input: acoustic signal => phonemes => words => semantic concepts. Phonological processing precedes semantic processing by about 90 ms (from N200 ERPs). F. Pulvermüller (2003), The Neuroscience of Language: On Brain Circuits of Words and Serial Order, Cambridge University Press.
Phonological neighborhood density: the number of words that are similar in sound to a target word. Similar = similar pattern of brain activations.
Semantic neighborhood density: the number of words that are similar in meaning to a target word.
Action-perception networks inferred from ERP and fMRI.

Semantic => vector representations
Word w in a context: ψ(w, Cont), a distribution of brain activations. States ψ(w, Cont) => lexicographical meanings: clusterize ψ(w, Cont) for all contexts, define prototypes ψ(w_k, Cont) for the different meanings w_k. Simplification: use spreading activation in semantic networks to define ψ. How does the activation flow? Try this algorithm on a collection of texts:
– Perform text pre-processing steps: stemming, stop-list, spell-checking...
– Use MetaMap with very restrictive settings to discover concepts, avoiding highly ambiguous results when mapping text to the UMLS ontology.
– Use UMLS relations to create first-order cosets (terms + all new terms from included relations); add only those types of relations that lead to improvement of classification results.
– Reduce the dimensionality of the first-order coset space, leaving all original features; use a feature ranking method for this reduction.
– Repeat the last two steps iteratively to create second- and higher-order enhanced spaces, first expanding, then shrinking the space.
– Create X vectors representing concepts.
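The expand-then-shrink loop above can be sketched in a few lines. This is an illustrative toy only: MetaMap and the UMLS ontology are replaced by a small hand-made relation table and feature scores, and the function names are our own.

```python
# Toy sketch of the iterative coset expansion + feature-ranking reduction.
# The relation table and scores stand in for UMLS relations and a real
# feature-ranking method; all names and data here are illustrative.

def expand(features, relations):
    """First-order coset: original terms plus all terms reachable via relations."""
    coset = set(features)
    for f in features:
        coset.update(relations.get(f, []))
    return coset

def rank_and_reduce(features, scores, keep_original, top_k):
    """Keep all original features, plus the top_k best-scoring new ones."""
    new = sorted((f for f in features if f not in keep_original),
                 key=lambda f: scores.get(f, 0.0), reverse=True)
    return set(keep_original) | set(new[:top_k])

relations = {"cobra": ["snake", "reptile"], "snake": ["animal"]}
scores = {"snake": 0.9, "reptile": 0.6, "animal": 0.3}
original = {"cobra"}

space = original
for _ in range(2):                  # build second- and higher-order spaces
    space = expand(space, relations)                           # expanding...
    space = rank_and_reduce(space, scores, original, top_k=2)  # ...then shrinking
```

Each pass first enlarges the space through relations and then prunes it back with the ranking, exactly mirroring the "first expanding, then shrinking" step on the slide.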

Semantic knowledge representation
vwCRK: certainty (v), truth (w), Concept, Relation, Keyword; similar to RDF in the semantic web.
Example: Cobra is_a animal, beast, being, brute, creature, fauna, organism, reptile, serpent, snake, vertebrate; has belly, body part, cell, chest, costa.
Simplest representation for massive evaluation/association: CDV, Concept Description Vectors, forming the Semantic Matrix.
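A minimal sketch of what a Semantic Matrix of CDVs looks like in code: rows are concepts, columns are keywords, and entries are relation strengths. The concept data and helper names below are illustrative, not taken from the system.

```python
# Concept Description Vectors (CDV) forming a Semantic Matrix: rows are
# concepts, columns are keywords, entries are strengths in [-1, 1]
# (0 = unknown). The data is a tiny illustrative fragment.

keywords = ["vertebrate", "snake", "belly", "horn"]

semantic_matrix = {
    "cobra":   [1.0, 1.0, 1.0, 0.0],
    "giraffe": [1.0, 0.0, 1.0, 1.0],
}

def cdv(concept):
    return semantic_matrix[concept]

def shared_features(a, b):
    """Keywords on which two concepts agree with nonzero strength."""
    return [k for k, x, y in zip(keywords, cdv(a), cdv(b)) if x == y != 0.0]
```

Such a flat matrix is what makes massive evaluation and association cheap compared to frame-based representations.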

Relations
– IS_A: inherit specific features from more general objects. Features are inherited with weight w from superior relations; v is decreased by 10% and corrected during interaction with the user.
– Similar: defines objects which share features with each other; acquire new knowledge from similar objects through swapping of unknown features with given certainty factors.
– Excludes: exchange some unknown features, but reverse the sign of the w weights.
– Entails: analogous to logical implication; one feature automatically entails a few more features (connected via the entail relation).
An atom of knowledge contains the strength and direction of relations between concepts and keywords, coming from 3 sources: directly entered into the knowledge base; deduced using predefined relation types from stored information; obtained during the system's interaction with the human user.
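The IS_A inheritance rule with the 10% certainty decay can be sketched as follows. The 0.9 decay factor comes from the slide; the concept hierarchy and data structures are illustrative assumptions.

```python
# Sketch of IS_A feature inheritance: a concept inherits its ancestors'
# keyword weights w, with certainty v decreased by 10% per inheritance
# step. The small hierarchy below is illustrative.

DECAY = 0.9  # v decreased by 10% per IS_A hop (from the slide)

features = {               # keyword -> (w, v): strength and certainty
    "animal": {"alive": (1.0, 1.0)},
    "snake":  {"legless": (1.0, 1.0)},
}
is_a = {"cobra": "snake", "snake": "animal"}

def inherited_features(concept):
    """Own features plus ancestors' features with decayed certainty."""
    result = dict(features.get(concept, {}))
    parent, v_factor = is_a.get(concept), DECAY
    while parent is not None:
        for kw, (w, v) in features.get(parent, {}).items():
            result.setdefault(kw, (w, v * v_factor))  # own features win
        parent, v_factor = is_a.get(parent), v_factor * DECAY
    return result
```

So "cobra" gets "legless" with certainty 0.9 from "snake" and "alive" with certainty 0.81 from "animal", before any user correction.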

20Q for semantic data acquisition
Play 20 questions with the avatar! Think of an animal; the system tries to guess it, asking no more than 20 questions that should be answered only with Yes or No. The given answers narrow the subspace of the most probable objects. The system learns from the games: it obtains new knowledge from interaction with the human users.
Example game:
Is it vertebrate? Y
Is it mammal? Y
Does it have hoof? Y
Is it equine? N
Is it bovine? N
Does it have horn? N
Does it have long neck? Y
I guess it is giraffe.

Algorithm for the 20 questions game
p(keyword = v_i) is the fraction of concepts for which the keyword has value v_i. The subspace of candidate concepts O(A) is selected:
O(A) = {i : d = |CDV_i − ANS| is minimal}
where CDV_i is the vector for the i-th concept and ANS is the partial vector of retrieved answers. We can deal with user mistakes by choosing d > minimal.
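The subspace selection above can be sketched directly. The distance here is a simple sum of absolute differences over answered keywords, and the `slack` parameter implements the "d > minimal" tolerance for user mistakes; the animal data is illustrative.

```python
# Sketch of candidate-subspace selection: keep concepts whose CDV is
# closest to the partial answer vector ANS; slack > 0 also admits
# near-minimal concepts to tolerate user mistakes. Data is illustrative.

def distance(cdv, ans):
    """Distance restricted to the keywords answered so far (ans: kw -> 0/1)."""
    return sum(abs(cdv[kw] - v) for kw, v in ans.items())

def candidate_subspace(cdvs, ans, slack=0):
    """O(A) = {i : d minimal}; slack > 0 corresponds to choosing d > minimal."""
    dists = {name: distance(cdv, ans) for name, cdv in cdvs.items()}
    d_min = min(dists.values())
    return {name for name, d in dists.items() if d <= d_min + slack}

cdvs = {
    "cobra":   {"vertebrate": 1, "mammal": 0, "long_neck": 0},
    "giraffe": {"vertebrate": 1, "mammal": 1, "long_neck": 1},
    "horse":   {"vertebrate": 1, "mammal": 1, "long_neck": 0},
}
ans = {"vertebrate": 1, "mammal": 1, "long_neck": 1}
```

With the answers above the minimal-distance subspace contains only "giraffe"; with slack=1 it also keeps "horse", so one wrong answer does not eliminate the true concept.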

Automatic data acquisition
Basic semantic data obtained from aggregation of machine-readable dictionaries: WordNet, ConceptNet, SUMO ontology.
– Used relations for the semantic category: animal.
– Semantic space truncated using word popularity ranks: IC, information content, the number of appearances of the particular word in WordNet descriptions; GR, GoogleRank, the number of web pages returned by the Google search engine for a given word; BNC, word statistics taken from the British National Corpus.
– Initial semantic space reduced to 94 objects and 72 features.

Human-interaction knowledge acquisition
Data obtained from machine-readable dictionaries is:
– not complete,
– not common sense,
– sometimes overly specialized,
– affected by some errors.
Knowledge correction in the semantic space uses: W_0, the initial weight, i.e. initial knowledge (from dictionaries); ANS, the answers given by users; N, the number of answers; β, a parameter indicating the importance of the initial knowledge.
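The correction formula itself did not survive in this transcript; a natural reading of the listed symbols is a β-weighted average of the initial dictionary weight W_0 and the N user answers ANS. That exact form is an assumption on our part, not taken from the source.

```python
# Assumed form of the knowledge-correction update: a beta-weighted average
# of the initial dictionary weight W0 and the N collected user answers.
# This formula is a plausible reconstruction, not quoted from the slides.

def corrected_weight(w0, answers, beta):
    """w = (beta * W0 + sum(ANS)) / (beta + N); large beta trusts dictionaries."""
    n = len(answers)
    return (beta * w0 + sum(answers)) / (beta + n)
```

With β large the dictionary knowledge dominates; as answers accumulate (N grows), user evidence gradually overrides the initial weight.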

Active dialogues
Dialogues with the user for obtaining new knowledge/features:
– When the system fails to guess the object: "I give up. Tell me what you thought of?" The concept used in the game corrects the semantic space.
– When two concepts have the same CDV: "Tell me what is characteristic for <concept>?" The new keywords for the specified concept are stored in semantic memory.
– When the system needs more knowledge about some concept: "I don't have any particular knowledge about <concept>. Tell me more about <concept>." The system obtains new keywords for the given concept.

Experiments in the animal domain
WordNet, ConceptNet, SUMO/MILO ontologies + the MindNet project as knowledge sources; a relation is added to the SM only if it appears in at least 2 sources. Basic space: 172 objects, 475 features, 5031 relations. Number of features per concept = CDV density. Initial CDV density = 29; adding IS_A relations = 41; adding similar, entails, excludes = 46.
Quality Q = N_S/N = number of searches with success / number of all searches. Error E = 1 − Q = 1 − N_S/N. For 10 concepts selected with the number of features close to the average: Q ~ 0.8; after 5 repetitions E ~ 18%, so some learning is needed.

Quality measures
Initial semantic space: average number of games for correct recognition ~2.8; this depends on the number of semantic neighbors close to the concept. Completeness of concept representation: is the CDV description sufficient to win the game? How far is it from the golden standard (manually created)? 4 measures of the concept description quality:
– S_d = N_f(GS) − N_f(O): number of Golden Standard features minus number of features in O, i.e. how many features are still missing compared to the golden standard.
– S_GS = Σ_i [1 − δ(CDV_i(GS), CDV_i(O))]: similarity based on co-occurrence.
– S_NO: number of features in O but not found in GS (the reverse of S_GS).
– Dif_w = Σ_i |CDV_i(O) − CDV_i(GS)|/m: average difference of O and GS.
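The four measures can be computed directly over aligned CDV vectors. Note one interpretation choice: we implement S_GS literally as the slide's Σ[1 − δ(·,·)], i.e. a count of mismatched positions; the vectors below are illustrative.

```python
# Sketch of the four description-quality measures over aligned CDV vectors
# for a concept O and its golden standard GS (0 = feature absent).
# S_GS is implemented literally as the count of mismatched positions.

def quality_measures(gs, o):
    n_f = lambda v: sum(1 for x in v if x != 0)   # number of present features
    m = len(gs)
    s_d = n_f(gs) - n_f(o)                        # features still missing vs GS
    s_gs = sum(1 for g, x in zip(gs, o) if g != x)          # sum of 1 - delta
    s_no = sum(1 for g, x in zip(gs, o) if g == 0 and x != 0)  # in O, not in GS
    dif_w = sum(abs(x - g) for g, x in zip(gs, o)) / m      # average difference
    return s_d, s_gs, s_no, dif_w

gs = [1, 1, 1, 0]   # illustrative golden standard
o  = [1, 0, 0, 1]   # illustrative learned description
```

For these toy vectors the learned description is missing one feature on balance (S_d = 1), disagrees with GS in three positions, and contains one spurious feature.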

Learning from games
Select O randomly, with preference given by p ~ exp(−N(O)/N), where N(O) is the number of features in O and N is the total number of features. Learning procedure:
– The CDV(O) representation of the chosen concept O is inspected and, if necessary, corrected.
– CDV(O) is removed from the memory.
– Try to learn the concept O by playing the 20 questions game.
Average results for 5 test objects as a function of the number of games are shown. NO' = S_NO + S_GS: a graph shows the average growth of the number of features as a function of the number of games played. Randomization of questions helps to find different features in each game. Average number of games needed to learn the selected concepts: 2.7. After the first successful game, once a particular concept had been correctly recognized, it was always found properly. After 4 games only a few new features are added.
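The concept-selection step can be sketched with weighted sampling. We take the slide's p ~ exp(−N(O)/N) at face value (as written, this weight decreases with N(O)); the feature counts and names are illustrative.

```python
# Sketch of selecting a concept O for a learning game with probability
# proportional to exp(-N(O)/N), as written on the slide.
# The concept list and counts are illustrative.

import math
import random

def pick_concept(feature_counts, total, rng):
    names = list(feature_counts)
    weights = [math.exp(-feature_counts[n] / total) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

counts = {"cobra": 40, "giraffe": 10}
rng = random.Random(0)                 # fixed seed for reproducibility
picked = pick_concept(counts, 50, rng)
```

Repeated draws from this distribution mix well-described and poorly-described concepts, so each game has a chance to add features somewhere new.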

A few conclusions
– Complex knowledge in frames is not very useful for large-scale search. Semantic search requires extensive knowledge.
– We do not have even the simplest common-sense knowledge description, yet in many applications such representations are sufficient. It should be easier to generate this knowledge than to wait for embodied systems.
– Semantic memory can be built from parsing dictionaries, encyclopedias, ontologies, and the results of collaborative projects.
– Active search is used to assign features found for concepts that are not far apart in the ontology (for example, that have the same parents).
– A large-scale effort to create a numerical version of WordNet for general applications is necessary; specialized knowledge is also important.
– Word games may help to create and correct some knowledge. 20Q is easier than the Turing test: a good intermediate step. Time for word game Olympics!

Thank you for lending your ear... Google: W. Duch => Papers, talks