Blaz Fortuna, Marko Grobelnik, Dunja Mladenic Jozef Stefan Institute ONTOGEN SEMI-AUTOMATIC ONTOLOGY EDITOR.


1 ONTOGEN SEMI-AUTOMATIC ONTOLOGY EDITOR
Blaz Fortuna, Marko Grobelnik, Dunja Mladenic
Jozef Stefan Institute
http://ontogen.ijs.si

2 Outline
- Motivation
- Functionality
- Conclusion
HCII2007, July 26th. Blaz Fortuna, Jozef Stefan Institute, Slovenia

3 Motivation

4 What is an ontology?
- An ontology is a data model that represents a set of concepts within a domain and the relationships between those concepts.
- Generally it consists of:
  - Classes: sets, collections, or types of objects
  - Instances: the basic or "ground level" objects
  - Relations: ways that objects can be related to one another
- It can be used:
  - as a schema for a knowledge management system,
  - to reason about the objects within that domain,
  - etc.
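The three components listed on this slide (classes, instances, relations) can be illustrated with a small data structure. This is only a sketch for orientation: the `Ontology` class, its method names, the `operates_in` relation, and the sample entries are hypothetical and are not taken from OntoGen.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Ontology:
    """Toy ontology container (hypothetical, not OntoGen's data model)."""
    # class name -> names of the instances that belong to it
    classes: Dict[str, Set[str]] = field(default_factory=dict)
    # relations stored as (subject, relation name, object) triples
    relations: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add_instance(self, cls: str, instance: str) -> None:
        self.classes.setdefault(cls, set()).add(instance)

    def relate(self, subject: str, relation: str, obj: str) -> None:
        self.relations.add((subject, relation, obj))

    def instances_of(self, cls: str) -> Set[str]:
        return self.classes.get(cls, set())


# Illustrative content only.
onto = Ontology()
onto.add_instance("Company", "Xerox")
onto.add_instance("Company", "Yahoo!")
onto.add_instance("Sector", "Internet services")
onto.relate("Yahoo!", "operates_in", "Internet services")
```

Storing relations as triples keeps the sketch small; a real editor would also need a concept hierarchy and persistence.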

5 Sample Ontology

6 Creating Ontology
- Ontologies are normally designed by knowledge engineers using ontology editors (Protégé, OntoStudio, …)
  - Domain experts are needed to help the knowledge engineer understand the domain
  - Ontology editors are not aware of the ontology's domain
- Our goal is to make the ontology editor easy to use and domain-aware so that it can be used by domain experts
  - This reduces the need for a knowledge engineer
  - This is done through the use of text mining and machine learning
- In this presentation we focus on the construction of topic ontologies
Diagram labels: Ontology Editor, Domain Expert, Knowledge Engineer
Sample documents:
Xerox: Xerox Corporation is a technology and services enterprise engaged in developing, manufacturing, marketing, servicing and financing a portfolio of document equipment, software, solutions and services. It manages its business in four segments: Production, Office, Developing Markets Operations (DMO) and Other. The Production segment includes black and white products, which operate at speeds over 90 pages per minute …
Yahoo!: Yahoo! Inc. is a provider of Internet products and services to consumers and businesses through the Yahoo! Network, its worldwide network of online properties. The Company's properties and services for consumers and businesses reside in four areas: Search and Marketplace, …
The Washington Post Company's principal business activities consist of newspaper publishing (principally The Washington Post), television broadcasting (through the ownership and operation of six television broadcast stations), the ownership and operation of cable television systems, magazine publishing (principally Newsweek magazine), and (through its Kaplan subsidiary) the provision of educational services. …

7 How does it work?
- OntoGen suggests concepts
  - Suggestions are generated automatically:
    - from the text corpus by clustering similar documents
    - based on a user query
    - through a text corpus map
  - The user selects appropriate suggestions and adds them to the ontology
- OntoGen helps the user decide which suggestions to include:
  - by extracting main keywords from the documents
  - with ontology and concept visualizations
  - by listing the documents behind concepts
- Behind each concept there is a set of documents
  - Documents are automatically assigned to concepts
  - Document assignments can be edited manually
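The idea of suggesting concepts by clustering similar documents and describing each cluster with keywords can be sketched as follows. The slides do not say which clustering or keyword method OntoGen uses, so this is an assumption-laden illustration: it uses bag-of-words vectors, a small k-means loop with cosine similarity seeded from the first k documents, and the highest-weighted centroid words as keywords. All function names and the sample documents are hypothetical.

```python
import math
from collections import Counter

# Tiny illustrative stopword list (an assumption, not OntoGen's).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "on"}


def vectorize(text):
    """Unit-length bag-of-words vector as a dict word -> weight."""
    counts = Counter(w for w in text.lower().split() if w not in STOPWORDS)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {w: c / norm for w, c in counts.items()}


def cosine(u, v):
    """Dot product; equals cosine similarity because vectors are unit length."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def centroid(vectors):
    """Normalized sum of the member vectors."""
    total = {}
    for vec in vectors:
        for w, x in vec.items():
            total[w] = total.get(w, 0.0) + x
    norm = math.sqrt(sum(x * x for x in total.values())) or 1.0
    return {w: x / norm for w, x in total.items()}


def suggest_concepts(docs, k, iterations=10):
    """Cluster docs into k groups; returns (assignment, centroids)."""
    vectors = [vectorize(d) for d in docs]
    centroids = vectors[:k]  # deterministic seeding: the first k documents
    assignment = [0] * len(docs)
    for _ in range(iterations):
        # Assign every document to its most similar centroid.
        assignment = [
            max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors
        ]
        # Recompute each centroid from its current members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
    return assignment, centroids


def keywords(centroid_vec, n=3):
    """Highest-weighted words of a cluster, ties broken alphabetically."""
    ranked = sorted(centroid_vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:n]]


docs = [
    "printer copier document equipment printer",
    "internet search online services internet",
    "document printer software equipment",
    "online internet search marketplace",
]
assignment, centroids = suggest_concepts(docs, k=2)
for c in range(2):
    print("suggested concept", c, "keywords:", keywords(centroids[c]))
```

Each cluster plays the role of a suggested sub-concept, and the list of documents assigned to it is what a user would inspect before accepting the suggestion.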

8 Example
Diagram labels: Domain, Text corpus, Ontology, Concept A, Concept B, Concept C

9 Functionality

10 Main Features
- Interactive user interface
  - The user can interact in real time with the integrated machine learning and text mining methods
- Concept discovery methods:
  - Unsupervised: the system provides suggestions
  - Supervised: concept learning, concept visualization
- Methods that help in understanding the discovered concepts:
  - Keyword extraction: generates a list of characteristic keywords of a given concept
  - Concept visualization: creates a map of documents from a given concept
    - Also available as a separate tool named Document Atlas, http://docatlas.ijs.si

11 Main view
Screenshot labels: Concept hierarchy, List of suggested sub-concepts, Ontology visualization, Selected concept

12 Concept suggestion
Screenshot labels: Selected concept, Suggested subconcepts, Add new concept, New concept

13 Personalized suggestions
Screenshot labels: Topics view, Countries view
Sample documents:
UK takeovers and mergers: The following are additions and deletions to the takeovers and mergers list for the week beginning August 19, as provided by the Takeover …
Lloyd’s CEO questioned in recovery suit in U.S.: Ronald Sandler, chief executive of Lloyd's of London, on Tuesday underwent a second day of court interrogation about …

14 Concept learning
Screenshot labels: Query, New Concept, Finish

15 Visualization of a concept’s instances
- Instances are visualized as points on a 2D map
- The distance between two instances on the map corresponds to their content similarity
- Characteristic keywords are shown for all parts of the map
- The user can select groups of instances on the map to create sub-concepts
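The quantity that such a map tries to reflect, how different two documents are in content, can be illustrated with a simple distance measure. The slides do not state which similarity measure or layout method is used, so the following is a sketch under the assumption of word-count vectors and cosine similarity; `content_distance` and the sample texts are hypothetical.

```python
import math
from collections import Counter


def content_distance(text_a, text_b):
    """1 - cosine similarity of word counts (0: same word mix, 1: no shared words)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


takeovers = "uk takeovers and mergers list"
mergers = "takeovers and mergers additions and deletions"
lloyds = "court interrogation of chief executive"

# Documents with overlapping vocabulary should end up closer on the map
# than documents with no words in common.
near = content_distance(takeovers, mergers)
far = content_distance(takeovers, lloyds)
print(near < far)
```

A 2D layout method would then place points so that on-screen distances approximate these pairwise values as well as possible.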

16 Concept management
Screenshot labels: Concept’s details, Concept’s instance management, Selected concept, Keywords, Selected instance

17 Adding new documents to ontology
Screenshot labels: New documents, Classification of selected document, Content of selected document, Selected document
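Classifying a new document against the existing concepts can be sketched as ranking the concepts by how similar their documents are to the new one. The slide does not describe the classifier, so this example assumes a nearest-profile approach over word counts; `rank_concepts`, the concept names, and the sample texts are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity of two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(x * x for x in a.values())) * math.sqrt(
        sum(x * x for x in b.values())
    )
    return dot / norm if norm else 0.0


def rank_concepts(new_doc, concepts):
    """Return (concept name, score) pairs, best match first."""
    doc_vec = bag_of_words(new_doc)
    scores = {}
    for name, docs in concepts.items():
        # The concept profile is the summed word counts of its documents.
        profile = Counter()
        for d in docs:
            profile.update(bag_of_words(d))
        scores[name] = cosine(doc_vec, profile)
    return sorted(scores.items(), key=lambda kv: -kv[1])


concepts = {
    "Publishing": [
        "newspaper publishing and magazine publishing",
        "television broadcasting stations",
    ],
    "Internet": [
        "internet products and online services",
        "search and marketplace services",
    ],
}
ranking = rank_concepts("online search services for consumers", concepts)
print(ranking[0][0])
```

The ranked list corresponds to what a user would review before confirming or manually correcting the assignment of the new document.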

18 Conclusions

19 Evaluation
- The first prototype was successfully used in several commercial projects:
  - Applied in multiple domains: business, legislation and digital libraries
  - Users were always domain experts with limited knowledge of and experience with ontology construction / knowledge engineering
  - Valuable data from the first trials was used as input for the interface design of the second prototype (the one presented here)
- Feedback from the users of the second prototype:
  - The main impression was that the tool saves time and is especially useful when working with large collections of documents
  - Among the main disadvantages were abstraction and an unattractive look
  - Many users use the program for exploration of the data

20 Future work
- Tools for suggestion and learning of more complex relations
- Extended support for collaborative editing of ontologies
- Easier input of background knowledge
- Improvement of the user interface based on the feedback from user trials and real-world users

21 Questions? Comments?
Thank you for listening!
http://ontogen.ijs.si

