Semi-Automatic Data-Driven Ontology Construction System

Presentation transcript:

Semi-Automatic Data-Driven Ontology Construction System
Blaz Fortuna, Marko Grobelnik, Dunja Mladenic
Jozef Stefan Institute

Main features of OntoGen
- Semi-Automatic
  - Text-mining methods provide suggestions and insights into the domain
  - The user can interact with the parameters of the text-mining methods
  - All final decisions are taken by the user
- Data-Driven
  - Most of the aid provided by the system is based on underlying data provided by the user
  - Instances are described by features extracted from the data (e.g. bag-of-words vectors)
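To make the bag-of-words description of instances concrete, here is a minimal sketch in plain Python. The tokenizer (lowercase, whitespace split), the example documents, and the function names are assumptions for illustration; the slides do not say how OntoGen tokenizes or weights terms.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase, split on whitespace, and count term occurrences."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical instances; in practice these come from the user's corpus.
docs = [
    "oil prices rise as crude supply falls",
    "crude oil supply falls again",
    "central bank raises interest rates",
]
vectors = [bag_of_words(d) for d in docs]

# Documents sharing vocabulary end up close together in this space.
sim_related = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
```

Once every instance is a vector, clustering, nearest-neighbour search, and classification can all be phrased in terms of the same similarity measure.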

OntoGen v1.0
- Designed for the construction of topic ontologies
- Clustering algorithms are used for topic suggestion
- Keyword extraction methods help the user name the concepts
- Interactive user interface
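The combination of clustering for topic suggestion and keyword extraction for naming can be sketched as follows. This is an illustration under assumptions, not OntoGen's implementation: it uses a small spherical k-means with fixed seed documents, and it takes the highest-weighted centroid terms as keywords. The slides name neither the clustering algorithm nor the keyword method.

```python
import math
from collections import Counter, defaultdict

def vectorize(text):
    """Unit-length bag-of-words vector stored as a {term: weight} dict."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()}

def dot(a, b):
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def centroid(members):
    """Sum of the member vectors, rescaled to unit length."""
    total = defaultdict(float)
    for vec in members:
        for t, v in vec.items():
            total[t] += v
    norm = math.sqrt(sum(v * v for v in total.values()))
    return {t: v / norm for t, v in total.items()}

def kmeans(vectors, seeds, iterations=10):
    """Spherical k-means; `seeds` are indices of the initial centroids."""
    centroids = [vectors[i] for i in seeds]
    assignment = []
    for _ in range(iterations):
        # Assign each document to its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda k: dot(vec, centroids[k]))
            for vec in vectors
        ]
        # Recompute each centroid from its current members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:
                centroids[k] = centroid(members)
    return assignment, centroids

def keywords(c, n=3):
    """Top-n centroid terms by weight (ties broken alphabetically)."""
    ranked = sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:n]]

# Hypothetical corpus with two obvious topics.
docs = [
    "oil prices rise as crude oil supply falls",
    "crude oil exports fall and oil prices jump",
    "bank raises interest rates to curb inflation",
    "interest rates and inflation worry the central bank",
]
vectors = [vectorize(d) for d in docs]
assignment, centroids = kmeans(vectors, seeds=[0, 2])
suggested_names = [keywords(c) for c in centroids]
```

Each cluster is a suggested sub-concept and its keyword list is a naming hint; in line with the semi-automatic design, the user would still decide which suggestions to accept.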

OntoGen v2.0
- Improved user interface
  - Based on feedback from users
- New features:
  - Active Learning: learning new concepts based on user queries and user classification of carefully selected documents
  - Simultaneous Ontologies: optimization of the similarity measure based on provided document categories
  - Concept’s Instances Visualization: integration of Document Atlas visualization
  - Ontology Population: interactive classification of new instances into the ontology

Sub-Concept suggestion
[Screenshot; labelled areas: Concept hierarchy, Sub-Concept suggestion, Ontology visualization]

Concept’s documents management
[Screenshot; labelled areas: Concept hierarchy, Concept’s documents management, Selected concept’s details]

Active Learning
- Active learning algorithm based on distance to the SVM hyperplane
- The first few labelled documents are bootstrapped using a user query and nearest-neighbour search
- In each step, the unlabelled document closest to the hyperplane is chosen for user classification
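The loop described on this slide can be sketched in plain Python. Several parts are assumptions made for illustration: the linear SVM is a small hinge-loss subgradient trainer (Pegasos-style, without a bias term) standing in for whatever SVM implementation OntoGen uses, `ask_user` is a hypothetical stand-in for the interactive yes/no question, and the bootstrap simply labels the three nearest neighbours of the query, which in this toy corpus happen to include both classes.

```python
import math
from collections import Counter

def vectorize(text):
    """Unit-length bag-of-words vector stored as a {term: weight} dict."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()}

def dot(a, b):
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def train_linear_svm(examples, lam=0.01, epochs=100):
    """Hinge-loss subgradient descent over (vector, label) pairs, label in {+1, -1}."""
    w, step = {}, 0
    for _ in range(epochs):
        for x, y in examples:
            step += 1
            eta = 1.0 / (lam * step)
            margin = y * dot(x, w)
            for t in w:                      # regularization shrink
                w[t] *= 1.0 - eta * lam
            if margin < 1:                   # hinge loss is active
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + eta * y * v
    return w

def closest_to_hyperplane(w, vectors, labels):
    """Index of the unlabelled document with the smallest |w . x|.
    For a fixed w this is the document nearest the separating hyperplane."""
    unlabelled = [i for i in range(len(vectors)) if i not in labels]
    return min(unlabelled, key=lambda i: abs(dot(vectors[i], w)))

docs = [
    "oil prices rise as crude supply falls",
    "crude oil exports fall",
    "bank raises interest rates as house prices climb",
    "inflation worries the central bank",
    "oil company borrows from bank at high interest rates",
    "opec agrees to cut crude output",
]
vectors = [vectorize(d) for d in docs]

def ask_user(i):
    """Hypothetical user: answers +1 if document i belongs to the new concept."""
    return 1 if i in {0, 1, 4, 5} else -1

# Bootstrap: label the nearest neighbours of the user's query.
query = vectorize("oil prices")
ranked = sorted(range(len(docs)), key=lambda i: (-dot(query, vectors[i]), i))
bootstrap = ranked[:3]
labels = {i: ask_user(i) for i in bootstrap}

# Active learning: retrain, then ask about the most uncertain document.
asked = []
for _ in range(2):
    w = train_linear_svm([(vectors[i], y) for i, y in labels.items()])
    pick = closest_to_hyperplane(w, vectors, labels)
    asked.append(pick)
    labels[pick] = ask_user(pick)
```

Asking about the document nearest the hyperplane targets the example the current classifier is least sure about, so each answer from the user tends to move the decision boundary more than a randomly chosen document would.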

New Concept

Simultaneous Ontologies
- Data: Reuters news articles
- Each news article is assigned two different sets of categories:
  - Topics
  - Countries
- Each set of categories offers a different view on the data
[Figure: the same documents shown under a Topics view and a Countries view]

Concept’s Instances Visualization

Ontology Population
- One-vs-all linear SVM used for classification
- Interactive user interface where the user can finalize the classifications
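A one-vs-all setup trains one binary classifier per concept and ranks concepts by classifier score for each new instance. The sketch below illustrates this in plain Python under assumptions: the SVM trainer is a small hinge-loss subgradient routine (Pegasos-style, no bias term) rather than OntoGen's actual implementation, and the concepts, training documents, and function names are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Unit-length bag-of-words vector stored as a {term: weight} dict."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()}

def dot(a, b):
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def train_linear_svm(examples, lam=0.01, epochs=100):
    """Hinge-loss subgradient descent over (vector, label) pairs, label in {+1, -1}."""
    w, step = {}, 0
    for _ in range(epochs):
        for x, y in examples:
            step += 1
            eta = 1.0 / (lam * step)
            margin = y * dot(x, w)
            for t in w:                      # regularization shrink
                w[t] *= 1.0 - eta * lam
            if margin < 1:                   # hinge loss is active
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + eta * y * v
    return w

def train_one_vs_all(training, concepts):
    """One binary model per concept: its own documents vs. all the others."""
    data = [(vectorize(text), concept) for text, concept in training]
    return {
        c: train_linear_svm([(x, 1 if label == c else -1) for x, label in data])
        for c in concepts
    }

def suggest(models, text):
    """Concepts ranked by SVM score, best first; the user makes the final call."""
    x = vectorize(text)
    scores = [(c, dot(x, w)) for c, w in models.items()]
    return sorted(scores, key=lambda cs: -cs[1])

# Hypothetical ontology with three concepts and their existing instances.
training = [
    ("oil prices rise as crude supply falls", "energy"),
    ("bank raises interest rates", "finance"),
    ("team wins football match", "sports"),
    ("crude oil exports fall", "energy"),
    ("central bank worries about inflation", "finance"),
    ("football club signs striker", "sports"),
]
models = train_one_vs_all(training, ["energy", "finance", "sports"])

ranking_a = suggest(models, "oil supply falls")
ranking_b = suggest(models, "football match tonight")
```

Returning the full ranking rather than a single label fits the interactive design: the interface can show the top suggestion while letting the user pick another concept before the classification is finalized.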

Classification of the selected document
[Screenshot; labelled areas: New documents, Classification of the selected document, Selected document]