Published by Curtis Wade. Modified over 9 years ago.
1
OntoGen: Semi-Automatic Ontology Editor
Blaz Fortuna, Marko Grobelnik, Dunja Mladenic
Jozef Stefan Institute
http://ontogen.ijs.si
2
Outline
- Motivation
- Functionality
- Conclusion
HCII2007, July 26th
Blaz Fortuna, Jozef Stefan Institute, Slovenia
3
Motivation
4
What is an ontology?
An ontology is a data model that represents a set of concepts within a domain and the relationships between those concepts.
Generally, it consists of:
- Classes: sets, collections, or types of objects
- Instances: the basic or "ground level" objects
- Relations: ways that objects can be related to one another
It can be used:
- as a schema for a knowledge management system
- to reason about the objects within that domain, etc.
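The three building blocks on this slide (classes, instances, relations) can be sketched as a small data structure. This is only an illustration and not OntoGen's internal model: the names `Concept`, `Ontology`, and the `sub-concept-of` relation are hypothetical, and the "Technology" and "Media" concepts are made up for the example. Only the company names come from the sample documents in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A class in the ontology: a named set of instances."""
    name: str
    instances: set[str] = field(default_factory=set)


@dataclass
class Ontology:
    """Concepts plus named relations between them (hypothetical model)."""
    concepts: dict[str, Concept] = field(default_factory=dict)
    # Each relation is a (subject concept, relation name, object concept) triple.
    relations: set[tuple[str, str, str]] = field(default_factory=set)

    def add_concept(self, name: str) -> Concept:
        # Reuse the concept if it already exists, otherwise create it.
        return self.concepts.setdefault(name, Concept(name))

    def add_instance(self, concept: str, instance: str) -> None:
        self.add_concept(concept).instances.add(instance)

    def relate(self, subject: str, relation: str, obj: str) -> None:
        self.add_concept(subject)
        self.add_concept(obj)
        self.relations.add((subject, relation, obj))

    def sub_concepts(self, name: str) -> list[str]:
        """Concepts linked to `name` by the assumed 'sub-concept-of' relation."""
        return sorted(
            s for s, r, o in self.relations if r == "sub-concept-of" and o == name
        )


onto = Ontology()
onto.relate("Technology", "sub-concept-of", "Companies")
onto.relate("Media", "sub-concept-of", "Companies")
onto.add_instance("Technology", "Xerox")
onto.add_instance("Technology", "Yahoo!")
onto.add_instance("Media", "The Washington Post Company")
print(onto.sub_concepts("Companies"))  # ['Media', 'Technology']
```

Storing relations as triples keeps the sketch open to relation types other than the concept hierarchy.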
5
Sample Ontology
6
Creating Ontology
- An ontology is normally designed by knowledge engineers using ontology editors: Protégé, OntoStudio, …
  - Domain experts are needed to help the knowledge engineer understand the domain
  - Ontology editors are not aware of the ontology's domain
- Our goal is to make the ontology editor easy to use and domain-aware so that it can be used by domain experts
  - This reduces the need for a knowledge engineer
  - This is done through the use of text mining and machine learning
- In this presentation we focus on the construction of topic ontologies
Figure labels: Domain Expert, Knowledge Engineer, Ontology Editor
Sample documents:
- Xerox: Xerox Corporation is a technology and services enterprise engaged in developing, manufacturing, marketing, servicing and financing a portfolio of document equipment, software, solutions and services. It manages its business in four segments: Production, Office, Developing Markets Operations (DMO) and Other. The Production segment includes black and white products, which operate at speeds over 90 pages per minute …
- Yahoo!: Yahoo! Inc. is a provider of Internet products and services to consumers and businesses through the Yahoo! Network, its worldwide network of online properties. The Company's properties and services for consumers and businesses reside in four areas: Search and Marketplace, …
- The Washington Post Company's principal business activities consist of newspaper publishing (principally The Washington Post), television broadcasting (through the ownership and operation of six television broadcast stations), the ownership and operation of cable television systems, magazine publishing (principally Newsweek magazine), and (through its Kaplan subsidiary) the provision of educational services. …
7
How does it work?
- OntoGen suggests concepts
  - Suggestions are generated automatically:
    - from the text corpus by clustering similar documents
    - based on a user query
    - through a map of the text corpus
- The user selects appropriate suggestions and adds them to the ontology
  - OntoGen helps the user decide which suggestions to include:
    - by extracting the main keywords from the documents
    - with ontology and concept visualizations
    - by listing the documents behind concepts
- Behind each concept there is a set of documents
  - Documents are automatically assigned to concepts
  - Document assignments can be edited manually
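Suggesting sub-concepts by clustering similar documents can be sketched as follows. The slides do not say which clustering algorithm or document representation OntoGen uses, so this sketch assumes k-means over TF-IDF bag-of-words vectors with cosine similarity. The function names, the `seed_ids` parameter, and the four-document corpus are all hypothetical.

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs):
    """Turn raw texts into L2-normalised sparse TF-IDF vectors (dicts)."""
    tokenised = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    doc_freq = Counter(w for tokens in tokenised for w in set(tokens))
    vectors = []
    for tokens in tokenised:
        vec = {w: c * math.log(len(docs) / doc_freq[w])
               for w, c in Counter(tokens).items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({w: v / norm for w, v in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(v * b.get(w, 0.0) for w, v in a.items())


def centroid(vectors):
    """Unit-length mean direction of a group of sparse vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)  # adds the weights word by word
    norm = math.sqrt(sum(v * v for v in total.values())) or 1.0
    return {w: v / norm for w, v in total.items()}


def suggest_subconcepts(docs, seed_ids, iterations=10):
    """Cluster docs with k-means; seed_ids picks the initial centroid documents."""
    vectors = tfidf_vectors(docs)
    centroids = [vectors[i] for i in seed_ids]
    assignment = []
    for _ in range(iterations):
        # Assign every document to its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda c: cosine(vec, centroids[c]))
            for vec in vectors
        ]
        # Recompute each centroid from the documents assigned to it.
        for c in range(len(centroids)):
            members = [vec for vec, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
    return assignment


corpus = [
    "printers copiers document equipment printing",
    "document printing equipment and copiers services",
    "internet search online network portal",
    "online search network and internet services",
]
clusters = suggest_subconcepts(corpus, seed_ids=[0, 2])
print(clusters)  # [0, 0, 1, 1]
```

Each cluster is a candidate sub-concept: the user would inspect its documents and keywords before deciding whether to add it to the ontology.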
8
Example
Figure labels: Domain, Text corpus, Ontology, Concept A, Concept B, Concept C
9
Functionality
10
Main Features
- Interactive user interface
  - The user can interact in real time with the integrated machine learning and text mining methods
- Concept discovery methods
  - Unsupervised: the system provides suggestions
  - Supervised: concept learning, concept visualization
- Methods that help the user understand the discovered concepts
  - Keyword extraction: generates a list of characteristic keywords for a given concept
  - Concept visualization: creates a map of the documents in a given concept
    - Also available as a separate tool named Document Atlas: http://docatlas.ijs.si
11
Main view
Figure labels: Concept hierarchy, List of suggested sub-concepts, Ontology visualization, Selected concept
12
Concept suggestion
Figure labels: Selected concept, Suggested sub-concepts, Add new concept, New concept
13
Personalized suggestions
Figure labels: Topics view, Countries view
Sample documents:
- UK takeovers and mergers: The following are additions and deletions to the takeovers and mergers list for the week beginning August 19, as provided by the Takeover …
- Lloyd's CEO questioned in recovery suit in U.S.: Ronald Sandler, chief executive of Lloyd's of London, on Tuesday underwent a second day of court interrogation about …
14
Concept learning
Figure labels: Query, New Concept, Finish
15
Visualization of a concept's instances
- Instances are visualized as points on a 2D map
- The distance between two instances on the map corresponds to their content similarity
- Characteristic keywords are shown for all parts of the map
- The user can select groups of instances on the map to create sub-concepts
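A map in which distance reflects content similarity can be sketched with classical multidimensional scaling (MDS). The slides do not state which projection OntoGen or Document Atlas uses, so MDS is an assumption here, as are the function name `classical_mds_2d` and the hypothetical cosine-similarity matrix for four documents.

```python
import math


def classical_mds_2d(similarity, iterations=200):
    """Place n items on a 2D map from an n x n cosine-similarity matrix."""
    n = len(similarity)
    # For unit-length vectors, squared Euclidean distance is 2 - 2 * cosine.
    d2 = [[2.0 - 2.0 * similarity[i][j] for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d2]
    total_mean = sum(row_mean) / n
    # Double centring turns squared distances into an inner-product matrix.
    b = [
        [-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
         for j in range(n)]
        for i in range(n)
    ]
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        # Deterministic, non-constant start vector for power iteration.
        vec = [math.sin(i + 1 + axis) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        for _ in range(iterations):
            nxt = [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:
                break
            vec = [x / norm for x in nxt]
        eigenvalue = sum(
            vec[i] * sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)
        )
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][axis] = vec[i] * scale
        # Deflate: remove the component just found before the next axis.
        b = [
            [b[i][j] - eigenvalue * vec[i] * vec[j] for j in range(n)]
            for i in range(n)
        ]
    return coords


# Hypothetical similarities: documents 0-1 are alike, as are documents 2-3.
similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
points = classical_mds_2d(similarity)
near = math.dist(points[0], points[1])
far = math.dist(points[0], points[2])
print(near < far)  # True: similar documents land closer together
```

Selecting a group of nearby points on such a map corresponds to picking a set of similar documents, which is what makes the map usable for creating sub-concepts.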
16
Concept management
Figure labels: Concept's details, Concept's instance management, Selected concept, Keywords, Selected instance
17
Adding new documents to ontology
Figure labels: New documents, Classification of selected document, Content of selected document, Selected document
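Classifying a new document into existing concepts can be sketched as a nearest-centroid lookup: compare the document with the pooled word counts of each concept's documents and rank the concepts by similarity. The slides do not name the classifier OntoGen uses, so this is an assumed stand-in. The function names, concept names, and texts are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count the alphabetic words of a text, ignoring case."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0


def classify(document, concepts):
    """Return (concept name, similarity) pairs, best match first."""
    doc_vec = bag_of_words(document)
    scored = []
    for name, texts in concepts.items():
        # The concept centroid is the pooled word count of its documents.
        centroid = Counter()
        for text in texts:
            centroid.update(bag_of_words(text))
        scored.append((name, cosine(doc_vec, centroid)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


concepts = {
    "Printing": ["office printers and copiers", "document printing services"],
    "Internet": ["internet search portal", "online search and internet services"],
}
ranking = classify("a new online search service for the internet", concepts)
print(ranking[0][0])  # Internet
```

Returning the full ranking rather than a single label mirrors the slide's idea that the classification is shown to the user, who can then edit the assignment manually.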
18
Conclusions
19
Evaluation
- The first prototype was successfully used in several commercial projects
  - Applied in multiple domains: business, legislation, and digital libraries
  - Users were always domain experts with limited knowledge of and experience with ontology construction / knowledge engineering
  - Valuable data from the first trials was used as input for the interface design of the second prototype (the one presented here)
- Feedback from the users of the second prototype
  - The main impression was that the tool saves time and is especially useful when working with large collections of documents
  - Among the main disadvantages were abstraction and an unattractive look
  - Many users use the program for exploration of the data
20
Future work
- Tools for suggestion and learning of more complex relations
- Extended support for collaborative editing of ontologies
- Easier input of background knowledge
- Improvement of the user interface based on the feedback from user trials and real-world users
21
Questions? Comments?
Thank you for listening!
http://ontogen.ijs.si