Blaz Fortuna, Marko Grobelnik, Dunja Mladenic Jozef Stefan Institute ONTOGEN SEMI-AUTOMATIC ONTOLOGY EDITOR.

Presentation transcript:

Blaz Fortuna, Marko Grobelnik, Dunja Mladenic Jozef Stefan Institute ONTOGEN SEMI-AUTOMATIC ONTOLOGY EDITOR

Outline
- Motivation
- Functionality
- Conclusion
HCII2007, July 26th, Blaz Fortuna, Jozef Stefan Institute, Slovenia

Motivation

What is an ontology?
- An ontology is a data model that represents a set of concepts within a domain and the relationships between those concepts.
- Generally it consists of:
  - Classes: sets, collections, or types of objects
  - Instances: the basic or "ground level" objects
  - Relations: ways that objects can be related to one another
- It can be used:
  - as a schema for a knowledge management system,
  - to reason about the objects within that domain,
  - etc.
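The three building blocks listed above can be sketched as a small data structure. This is only an illustration, not OntoGen's internal representation: the class names, the "sub-concept-of" relation label, and the concept names "Companies", "Technology" and "Media" are assumptions made for the example; the company names come from the sample documents in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A class in the ontology: a named set of instances."""
    name: str
    instances: set[str] = field(default_factory=set)


@dataclass
class Ontology:
    concepts: dict[str, Concept] = field(default_factory=dict)
    # Relations are stored as (subject, relation name, object) triples.
    relations: set[tuple[str, str, str]] = field(default_factory=set)

    def add_concept(self, name: str) -> Concept:
        # Create the concept on first use, otherwise return the existing one.
        if name not in self.concepts:
            self.concepts[name] = Concept(name)
        return self.concepts[name]

    def add_instance(self, concept: str, instance: str) -> None:
        self.add_concept(concept).instances.add(instance)

    def relate(self, subject: str, relation: str, obj: str) -> None:
        self.add_concept(subject)
        self.add_concept(obj)
        self.relations.add((subject, relation, obj))

    def sub_concepts(self, name: str) -> list[str]:
        # "sub-concept-of" is a hypothetical relation label used in this sketch.
        return sorted(s for s, r, o in self.relations
                      if r == "sub-concept-of" and o == name)


onto = Ontology()
onto.relate("Technology", "sub-concept-of", "Companies")
onto.relate("Media", "sub-concept-of", "Companies")
onto.add_instance("Technology", "Xerox")
onto.add_instance("Technology", "Yahoo!")
onto.add_instance("Media", "The Washington Post Company")
print(onto.sub_concepts("Companies"))
```

In a topic ontology of the kind this talk focuses on, the instances would be documents and the main relation would be the sub-concept hierarchy.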

Sample Ontology

Creating Ontology
- An ontology is normally designed by knowledge engineers using ontology editors:
  - Protégé, OntoStudio, …
- Domain experts are needed to help the knowledge engineer understand the domain
- Ontology editors are not aware of the ontology's domain
- Our goal is to make the ontology editor easy to use and domain-aware, so that it can be used by domain experts.
  - This reduces the need for a knowledge engineer
  - This is done through the use of text mining and machine learning.
- In this presentation we focus on the construction of topic ontologies
[Diagram labels: Ontology Editor, Domain Expert, Knowledge Engineer]
Example documents:
Xerox: Xerox Corporation is a technology and services enterprise engaged in developing, manufacturing, marketing, servicing and financing a portfolio of document equipment, software, solutions and services. It manages its business in four segments: Production, Office, Developing Markets Operations (DMO) and Other. The Production segment includes black and white products, which operate at speeds over 90 pages per minute …
Yahoo!: Yahoo! Inc. is a provider of Internet products and services to consumers and businesses through the Yahoo! Network, its worldwide network of online properties. The Company's properties and services for consumers and businesses reside in four areas: Search and Marketplace, …
The Washington Post: The Washington Post Company's principal business activities consist of newspaper publishing (principally The Washington Post), television broadcasting (through the ownership and operation of six television broadcast stations), the ownership and operation of cable television systems, magazine publishing (principally Newsweek magazine), and (through its Kaplan subsidiary) the provision of educational services. …

How does it work?
- OntoGen suggests concepts
  - Suggestions are generated automatically
    - … from the text corpus by clustering similar documents
    - … based on a user query
    - … through a text corpus map
  - The user selects appropriate suggestions and adds them to the ontology
- OntoGen helps the user decide which suggestions to include
  - … by extracting the main keywords from the documents
  - … with ontology and concept visualizations
  - … by listing the documents behind concepts
- Behind each concept there is a set of documents
  - Documents are automatically assigned to concepts
  - Document assignments can be edited manually
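The suggestion step described above (cluster similar documents, then describe each cluster by its main keywords) can be sketched as follows. The slides do not say which clustering algorithm or term weighting OntoGen uses, so k-means with cosine similarity over TF-IDF vectors is an assumption here, as are all function names and the toy corpus. The sketch also assumes `k` is no larger than the number of documents.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map tokenized documents to unit-length TF-IDF vectors (term -> weight)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        weights = {t: c * math.log(n / df[t]) for t, c in Counter(doc).items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Unit-length mean direction of a group of vectors (empty dict if none)."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}


def suggest_subconcepts(docs, k=2, iterations=10, top=2):
    """Return one (keywords, member document indices) pair per suggested cluster."""
    vectors = tfidf_vectors(docs)
    # Farthest-first seeding keeps this sketch deterministic.
    seeds = [0]
    while len(seeds) < k:
        rest = [i for i in range(len(vectors)) if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(cosine(vectors[i], vectors[s])
                                                 for s in seeds)))
    centroids = [vectors[s] for s in seeds]
    for _ in range(iterations):
        # Assign each document to its most similar centroid, then recompute.
        assignment = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                      for v in vectors]
        centroids = [
            centroid([v for v, a in zip(vectors, assignment) if a == j]) or centroids[j]
            for j in range(k)
        ]
    suggestions = []
    for j in range(k):
        # Keywords are the highest-weighted terms of the cluster centroid.
        keywords = sorted(centroids[j], key=lambda t: (-centroids[j][t], t))[:top]
        members = [i for i, a in enumerate(assignment) if a == j]
        suggestions.append((keywords, members))
    return suggestions


# Hypothetical toy corpus of already-tokenized documents.
corpus = [
    ["document", "printer", "copier"],
    ["newspaper", "television", "magazine"],
    ["document", "printer", "scanner"],
    ["newspaper", "television", "cable"],
    ["document", "printer", "toner"],
    ["newspaper", "television", "broadcast"],
]
suggestions = suggest_subconcepts(corpus)
for keywords, members in suggestions:
    print(keywords, members)
```

Each returned pair corresponds to one suggested sub-concept: the keywords help the user judge the suggestion, and the member list is the set of documents that would sit behind the concept if it is accepted.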

Example
[Diagram labels: Domain, Text corpus, Ontology, Concept A, Concept B, Concept C]

Functionality

Main Features
- Interactive user interface
  - The user can interact in real time with the integrated machine learning and text mining methods
- Concept discovery methods:
  - Unsupervised: the system provides suggestions
  - Supervised: concept learning, concept visualization
- Methods for helping to understand the discovered concepts:
  - Keyword extraction: generates a list of characteristic keywords of a given concept
  - Concept visualization: creates a map of documents from a given concept; also available as a separate tool named Document Atlas

Main view
[Screenshot labels: Concept hierarchy, List of suggested sub-concepts, Ontology visualization, Selected concept]

Concept suggestion
[Screenshot labels: Selected concept, Suggested subconcepts, Add new concept, New concept]

Personalized suggestions
[Screenshot labels: Topics view, Countries view]
Example documents:
UK takeovers and mergers: The following are additions and deletions to the takeovers and mergers list for the week beginning August 19, as provided by the Takeover …
Lloyd’s CEO questioned in recovery suit in U.S.: Ronald Sandler, chief executive of Lloyd's of London, on Tuesday underwent a second day of court interrogation about …

Concept learning
[Screenshot labels: Query, New Concept, Finish]

Concept’s instances visualization
- Instances are visualized as points on a 2D map
- The distance between two instances on the map corresponds to their content similarity
- Characteristic keywords are shown for all parts of the map
- The user can select groups of instances on the map to create sub-concepts.
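To make "content similarity" concrete, here is a minimal sketch of one common way to measure it: cosine distance between bag-of-words vectors. The slides do not state which measure or which 2D projection the tool uses, so the measure, the function name, and the sample texts are assumptions; a layout method would then try to place documents so that map distances follow these values.

```python
import math
from collections import Counter


def cosine_distance(a: str, b: str) -> float:
    """1 - cosine similarity of two texts' word-count vectors (0 = same content)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vb[term] for term, count in va.items())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    # Empty texts have no direction; treat them as maximally distant.
    return 1.0 - dot / norm if norm else 1.0


# Hypothetical snippets used only for illustration.
d_merger = "takeovers and mergers list for the week"
d_merger2 = "additions to the takeovers and mergers list"
d_court = "chief executive underwent court interrogation"

print(cosine_distance(d_merger, d_merger2))
print(cosine_distance(d_merger, d_court))
```

Documents on the same topic get a small distance and would appear close together on the map, while unrelated documents get a distance near 1.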

Concept management
[Screenshot labels: Concept’s details, Concept’s instance management, Selected concept, Keywords, Selected instance]

Adding new documents to ontology
[Screenshot labels: New documents, Selected document, Classification of selected document, Content of selected document]
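Classifying a new document into existing concepts can be sketched with a nearest-centroid rule: each concept is summarized by the combined word counts of its documents, and the new document is ranked against every concept by cosine similarity. The slides do not name the classifier OntoGen uses, so this rule, the function names, and the concept names and texts are assumptions for illustration.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    na = math.sqrt(sum(c * c for c in a.values()))
    nb = math.sqrt(sum(c * c for c in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_concepts(concept_docs: dict[str, list[str]], new_doc: str) -> list[tuple[str, float]]:
    """Rank concepts by similarity between the new document and each concept centroid."""
    query = bag_of_words(new_doc)
    scores = []
    for name, docs in concept_docs.items():
        centroid = Counter()
        for doc in docs:
            centroid.update(bag_of_words(doc))  # sum word counts over the concept
        scores.append((name, cosine(query, centroid)))
    return sorted(scores, key=lambda pair: -pair[1])


# Hypothetical concepts with a few words from the sample company descriptions.
concept_docs = {
    "Technology": ["document equipment software services",
                   "internet products and services"],
    "Media": ["newspaper publishing television broadcasting",
              "magazine publishing cable television"],
}
ranking = rank_concepts(concept_docs, "cable television systems and magazine publishing")
print(ranking)
```

The top-ranked concept is the proposed assignment; since document assignments can be edited manually in the tool, a ranking like this would serve as a suggestion rather than a final decision.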

Conclusions

Evaluation
- The first prototype was successfully used in several commercial projects:
  - Applied in multiple domains: business, legislation and digital libraries
  - Users were always domain experts with limited knowledge of and experience with ontology construction / knowledge engineering
  - Valuable data from the first trials was used as input for the interface design of the second prototype (the one presented here).
- Feedback from the users of the second prototype:
  - The main impression was that the tool saves time and is especially useful when working with large collections of documents
  - Among the main disadvantages were abstraction and an unattractive look
  - Many users use the program for exploration of the data

Future work
- Tools for suggestion and learning of more complex relations
- Extended support for collaborative editing of ontologies
- Easier input of background knowledge
- Improvement of the user interface based on feedback from user trials and real-world users

Questions? Comments? Thank you for listening!