Dynamic Building of Domain Specific Lexicons Using Emergent Semantics. Final Presentation. Matt Selway 100079967. Supervisor: Professor Markus Stumptner.


Dynamic Building of Domain Specific Lexicons Using Emergent Semantics
Final Presentation
Matt Selway
Supervisor: Professor Markus Stumptner
Knowledge and Software Engineering Laboratory, School of Computer and Information Science

Contents
- Motivations and Goals
- Research Questions
- Method
- Experiments and Results
- Summary and Conclusions
- Limitations and Future Work

Motivations and Goals
- Kleiner et al. (2009) developed a very different approach to Natural Language Processing (NLP):
  - Treat NLP as a model transformation problem
  - Utilise configuration as the model transformation
- Model transformation is the process of taking input models and creating output models from them
  - Foundation of Model Driven Engineering
- Configuration is a constraint-based search technique
  - In this case the constraints are conformance to the desired metamodel
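The idea of configuration as a constraint-based search can be illustrated with a minimal sketch. This is not the configurator used by Kleiner et al.; the lexicon, the category names, and the `conforms` rule are all invented stand-ins for a real metamodel, and the search is a brute-force enumeration where a real configurator would use constraint propagation.

```python
from itertools import product

# Hypothetical lexicon: each word maps to the categories it may take.
LEXICON = {
    "each": {"determiner"},
    "customer": {"noun"},
    "rents": {"verb", "noun"},
    "a": {"determiner"},
    "car": {"noun"},
}


def conforms(categories):
    """Toy stand-in for metamodel conformance: exactly one verb, with a
    determiner-noun phrase on either side of it."""
    if categories.count("verb") != 1:
        return False
    verb_at = categories.index("verb")

    def is_noun_phrase(part):
        return list(part) == ["determiner", "noun"]

    return is_noun_phrase(categories[:verb_at]) and is_noun_phrase(categories[verb_at + 1:])


def configure(words):
    """Enumerate every category assignment and keep those that satisfy the constraints."""
    options = [sorted(LEXICON[word]) for word in words]
    return [combo for combo in product(*options) if conforms(combo)]


solutions = configure(["each", "customer", "rents", "a", "car"])
print(solutions)
```

The ambiguous word "rents" is resolved by the constraints alone: only the reading in which it is a verb yields a conforming output model. The sketch also shows why the lexicon matters, since the search has nothing to work with for a word that is missing from `LEXICON`.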

Motivations and Goals
- Overview of Process (Kleiner et al. 2009)
- The method shows promising results
- However, it requires the use of a predefined lexicon

Motivations and Goals
Issues for practical applications:
1. It can take a long time to manually build a complete lexicon, even for a specific domain
2. A predefined lexicon is static
3. It reduces the level of automation

Motivations and Goals
Short-range Goals:
1. At least partially automated creation of domain specific lexicons directly from the input text, using external resources to retrieve lexical data
2. Make updates a natural part of the system
3. Allow sharing/reuse of lexical information
Long-range Goals:
1. Improve the automated analysis of specifications
2. Support research into semantic interoperability
3. Develop global agreement on lexicons/ontologies

Research Questions
- Can we reduce or eliminate the need to manually predefine a lexicon by dynamically building a lexicon based on the input text?
- How much of a reduction can be gained?
- How well does it work? (i.e. accuracy of retrieved data, how much data is automatically retrieved)
- What are its limitations?

Method
- Developed an experimental system
- Attempted to use emergent semantics and semiotic dynamics in a way similar to that described by Steels and Hanappe (2006) for the interoperability of collective information systems
  - They propose a multi-agent system that uses communication to arrive at an agreement on the meaning of the data, its tags, and its categories
  - They take advantage of the semiotic triad between data, tags, and categories in user taxonomies (e.g. bookmarks in a web browser)
  - A semiotic triad implies a meaningful relationship between its three components

Method
- Basic semiotic triad (Steels & Hanappe, 2006)
- Similarly, there exists a semiotic triad between a word, its use, and the domain it is used in
- The idea is that this triad can be used to dynamically develop domain specific lexicons between information agents
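One way to picture the word/use/domain triad is as a small record that an agent counts observations of. The sketch below is only an illustration under that assumption; the class names, the example domains, and the counting scheme are hypothetical and are not taken from the presented system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triad:
    """One observed relationship between a word, its use, and a domain."""
    word: str
    use: str      # e.g. a lexical category such as "noun" or "verb"
    domain: str   # e.g. "car hire" (invented example domain)


class TriadStore:
    """Counts how often each triad has been observed."""

    def __init__(self):
        self._counts = defaultdict(int)

    def observe(self, word, use, domain):
        self._counts[Triad(word, use, domain)] += 1

    def uses(self, word, domain):
        """Return {use: count} for one word within one domain."""
        return {
            triad.use: count
            for triad, count in self._counts.items()
            if triad.word == word and triad.domain == domain
        }


store = TriadStore()
store.observe("rent", "verb", "car hire")
store.observe("rent", "verb", "car hire")
store.observe("rent", "noun", "property")
print(store.uses("rent", "car hire"))
```

Keeping the domain as part of the key is what lets the same word carry different uses in different lexicons without the entries overwriting each other.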

Method (Design)
- Multi-agent system
  - Lexical information retrieved from other agents
  - Initial data downloaded from online sources
- User feedback adjusts the retrieved data
- Agents update their lexicons and their associations to lexicons based on user feedback (using the semiotic relationship)
  - Many changes indicate that the agents are actually using different domains
  - Few changes indicate updates to the lexicon within the same domain
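The feedback rule on this slide can be sketched as follows. This is a guess at the general shape rather than the presented implementation: the class, the `reward`, `penalty`, and `change_ratio_threshold` values, and the decision to leave a lexicon untouched when many corrections arrive are all assumptions made for illustration.

```python
class LexiconAgent:
    """Holds per-domain lexicons and a strength of association to each."""

    def __init__(self, reward=0.1, penalty=0.2, change_ratio_threshold=0.5):
        self.lexicons = {}   # domain -> {word: category}
        self.strength = {}   # domain -> association strength in [0, 1]
        self.reward = reward
        self.penalty = penalty
        self.threshold = change_ratio_threshold

    def add_lexicon(self, domain, entries, strength=0.5):
        self.lexicons[domain] = dict(entries)
        self.strength[domain] = strength

    def apply_feedback(self, domain, corrections, words_checked):
        """corrections: {word: category chosen by the user};
        words_checked: how many words the user reviewed in total."""
        lexicon = self.lexicons[domain]
        changed = sum(1 for word, cat in corrections.items() if lexicon.get(word) != cat)
        ratio = changed / words_checked if words_checked else 0.0
        if ratio > self.threshold:
            # Many changes: the text probably belongs to a different domain,
            # so weaken the association instead of rewriting the lexicon.
            self.strength[domain] = max(0.0, self.strength[domain] - self.penalty)
        else:
            # Few changes: treat them as ordinary updates within the same domain.
            lexicon.update(corrections)
            self.strength[domain] = min(1.0, self.strength[domain] + self.reward)
        return ratio


agent = LexiconAgent()
agent.add_lexicon("car hire", {"rent": "noun", "car": "noun", "branch": "noun", "customer": "noun"})
few = agent.apply_feedback("car hire", {"rent": "verb"}, words_checked=4)
many = agent.apply_feedback("car hire", {"car": "verb", "branch": "verb", "customer": "verb"}, words_checked=4)
print(few, many, agent.strength["car hire"])
```

In the usage above, the single correction is absorbed into the lexicon and strengthens the association, while the second round of three corrections out of four lowers the strength and leaves the entries alone.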

Method (Online Sources)
- Surveyed online lexicons/ontologies (CYC, WordNet, EDR) and dictionaries (Oxford, 'The Free Dictionary', 'Your Dictionary')
- Excluded CYC, WordNet, and EDR as not suitable
- Turned to standard online dictionaries
  - Official dictionaries (Oxford/Harvard) not suitable, as they require paid access
- Discovered 'The Free Dictionary'
  - Large number of entries
  - Enough detail in definitions (transitive/intransitive verbs, definite/indefinite articles, etc.)
  - Reasonably standard pages for parsing
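To give a feel for what parsing a "reasonably standard" dictionary page involves, here is a minimal sketch using Python's standard `html.parser`. The markup in `SAMPLE_PAGE`, the abbreviations in `POS_LABELS`, and the assumption that part-of-speech labels sit inside `<i>` elements are invented for illustration; they do not describe the real structure of The Free Dictionary's pages.

```python
from html.parser import HTMLParser

# Invented abbreviation-to-category table, for illustration only.
POS_LABELS = {
    "n.": "noun",
    "tr.v.": "transitive verb",
    "intr.v.": "intransitive verb",
    "adj.": "adjective",
}


class PosExtractor(HTMLParser):
    """Collects part-of-speech labels found inside <i> elements."""

    def __init__(self):
        super().__init__()
        self._in_italic = False
        self.categories = []

    def handle_starttag(self, tag, attrs):
        if tag == "i":
            self._in_italic = True

    def handle_endtag(self, tag):
        if tag == "i":
            self._in_italic = False

    def handle_data(self, data):
        label = data.strip()
        if self._in_italic and label in POS_LABELS:
            category = POS_LABELS[label]
            if category not in self.categories:
                self.categories.append(category)


# Hypothetical page fragment; a real page would need its own selectors.
SAMPLE_PAGE = """
<div class="entry">
  <h2>rent</h2>
  <i>n.</i> Payment made for the use of property.
  <i>tr.v.</i> To obtain the use of something in return for payment.
  <i>intr.v.</i> To be for rent.
</div>
"""


def extract_categories(page):
    parser = PosExtractor()
    parser.feed(page)
    parser.close()
    return parser.categories


print(extract_categories(SAMPLE_PAGE))
```

A parser like this returns every category listed for a word, whether or not the domain uses it, which is one plausible source of the surplus data discussed in the conclusions.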

Method (Lexicon)

Method (Agent Communication)

Experiments and Results

Summary and Conclusions
- It works! How well?
  - A high percentage of words had data retrieved; however, too much unnecessary data reduces the effectiveness
  - Accuracy is affected by many factors:
    - Incomplete/incorrect parsing of the web page
    - Small SBVR specification sample
    - SBVR keywords
- We believe it is worth pursuing and improving
  - Fix parsing, use multiple sources
  - Define keyword lexicons, dynamically generate the rest
  - Fill in gaps/cull using words with only one category
  - Etc.
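The two observations above (many words had data retrieved, but much of it was unnecessary) could be quantified along the following lines. This is one possible pair of measures, not the evaluation used in the presentation; the function name, the definitions of coverage and surplus, and the sample data are assumptions.

```python
def coverage_and_surplus(retrieved, expected):
    """retrieved: {word: set of categories fetched for it};
    expected: {word: set of categories the specification actually uses}.

    Returns (coverage, surplus_ratio), where coverage is the share of expected
    words with any retrieved data and surplus_ratio is the share of retrieved
    categories that were not needed."""
    words = list(expected)
    covered = [word for word in words if retrieved.get(word)]
    coverage = len(covered) / len(words) if words else 0.0

    fetched = sum(len(retrieved.get(word, set())) for word in words)
    surplus = sum(len(retrieved.get(word, set()) - expected[word]) for word in words)
    surplus_ratio = surplus / fetched if fetched else 0.0
    return coverage, surplus_ratio


# Invented sample data for illustration.
expected = {
    "customer": {"noun"},
    "rents": {"verb"},
    "hire": {"noun"},
    "each": {"determiner"},
}
retrieved = {
    "customer": {"noun"},
    "rents": {"verb", "noun"},
    "hire": {"noun", "verb"},
}

coverage, surplus_ratio = coverage_and_surplus(retrieved, expected)
print(coverage, surplus_ratio)
```

Reporting the two numbers separately keeps a high retrieval rate from hiding the cost of the extra categories that later have to be culled.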

Limitations and Future Work
- Choice of dictionary
  - Potentially use multiple data sources
  - Joint words, i.e. most SBVR keywords
- Implementation not perfect
  - Parsing of the data source
  - No synonyms
  - Communication protocol
  - Errors in adjusting association strengths
- Strength adjustment values and threshold values used for lexicon classifiers need more research to find more appropriate values
- Etc.

References
Farlex 2010, The Free Dictionary, viewed 11 September 2010.
Kleiner, M, Albert, P & Bézivin, J 2009, 'Configuring Models for (Controlled) Languages', in Proceedings of the IJCAI–09 Workshop on Configuration (ConfWS–09), Pasadena, CA, USA.
Steels, L & Hanappe, P 2006, 'Interoperability Through Emergent Semantics: A Semiotic Dynamics Approach', in Journal on Data Semantics VI, vol. 4090, Springer Berlin / Heidelberg.