Use of FCA in the Ontology Extraction Step for the Improvement of the Semantic Information Retrieval Peter Butka TU Košice, Slovakia.

Presentation transcript:

Use of FCA in the Ontology Extraction Step for the Improvement of the Semantic Information Retrieval
Peter Butka, TU Košice, Slovakia

Semantic Web Environment and Retrieval Tasks
Improving information retrieval on an unknown set of text documents
– Preprocessing of the document set
– Building the ontology
    Creating the concept hierarchy
    Finding relations between concepts
    Extracting instances (ontology population)
– Using the created ontology and instances to improve IR
[Diagram labels: Unknown set of documents; "Classic" indexing methods; Preprocessing steps; Building of ontology; IR task; Ontology-based IR; comb.]

Use of FCA and labeling of concepts
Formal Concept Analysis
– Explorative method for data analysis
– Concept lattice
    A concept is a cluster of "similar" objects (similarity is based on the presence of the same attributes)
    Concepts are hierarchically organized (specific vs. general)
Use of FCA on texts
– Output: one-sided fuzzy concept lattice
– Clustering via concepts (agglomerative)
– Interpretation
    Use of the LabelSOM method to improve the interpretability of concepts (clusters)
Example:
Concept (467): IntSet 467 (|467| = 2) = {6 59}
Labels (467): indian, provinc, negoti, nehru, kashmir, india, pakistan
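The definition of a concept as a cluster of objects sharing the same attributes can be illustrated with a minimal sketch. It assumes a crisp (binary) document-term context, which is a simplification of the one-sided fuzzy lattice named on the slide. The function name `formal_concepts` and the toy documents `d1` to `d3` are hypothetical, and the brute-force enumeration is only practical for very small inputs.

```python
from itertools import combinations

def formal_concepts(context):
    """Return all formal concepts of a crisp context.

    context maps each object (document id) to its set of attributes (terms).
    A concept is a pair (extent, intent): the extent is exactly the set of
    objects having every attribute in the intent, and the intent is exactly
    the set of attributes shared by every object in the extent.
    """
    objects = sorted(context)
    all_attrs = set().union(*context.values())

    def common_attrs(objs):
        # Attributes shared by every object in objs (all attributes if objs is empty).
        shared = set(all_attrs)
        for o in objs:
            shared &= context[o]
        return shared

    def objects_with(attrs):
        # Objects that carry every attribute in attrs.
        return {o for o in objects if attrs <= context[o]}

    concepts = set()
    # Brute force: close every subset of objects (exponential in the number of objects).
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attrs(subset)
            extent = objects_with(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

# Hypothetical toy context: three documents described by stemmed terms.
toy_context = {
    "d1": {"india", "pakistan", "kashmir"},
    "d2": {"india", "pakistan"},
    "d3": {"india", "trade"},
}
concepts = formal_concepts(toy_context)

# Print from general (large extent) to specific (small extent).
for extent, intent in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by extent size shows the hierarchy mentioned above: the most general concept contains all three documents and only the term `india`, while more specific concepts contain fewer documents and more terms.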

FCA and texts
– Interpretability problems
– Time-consuming
Solution: problem reduction, e.g. use of clustering algorithms
– Pre-clustering of the document set (e.g. hierarchical SOM)
– Creation of ontology parts from the smaller sets (FCA)
– Merging of the small models into a complete ontology
Possible use of the reduction approach in the ontology creation step
[Diagram labels: Starting set of documents; C1, C2, ..., Cn; O1, O2, ..., On; Final Merged Ontology; phases: Clustering phase, Ontology parts creating phase, Merging phase]
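The three phases listed above (clustering, creation of ontology parts, merging) can be sketched as a small pipeline. This is an illustration under loud assumptions, not the method from the presentation: a greedy Jaccard grouping stands in for hierarchical SOM, crisp FCA stands in for the one-sided fuzzy variant, and the merge step is a placeholder union under an artificial root, because the slides do not specify how merging works. All function names, the threshold, and the toy documents are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pre_cluster(docs, threshold=0.3):
    """Clustering phase (stand-in): greedy grouping by similarity to a cluster's first document."""
    clusters = []
    for doc_id in sorted(docs):
        for cluster in clusters:
            if jaccard(docs[doc_id], docs[cluster[0]]) >= threshold:
                cluster.append(doc_id)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([doc_id])
    return clusters

def local_concepts(docs, doc_ids):
    """Ontology-part phase (stand-in): crisp FCA restricted to one cluster."""
    attrs = set().union(*(docs[d] for d in doc_ids))
    found = set()
    # Exponential in len(doc_ids), which is why small clusters help.
    for size in range(len(doc_ids) + 1):
        for subset in combinations(doc_ids, size):
            intent = set(attrs)
            for d in subset:
                intent &= docs[d]
            extent = frozenset(d for d in doc_ids if intent <= docs[d])
            found.add((extent, frozenset(intent)))
    return found

def merge(parts):
    """Merging phase (placeholder): union of local concepts under one artificial root."""
    all_docs = frozenset(d for part in parts for extent, _ in part for d in extent)
    merged = {(all_docs, frozenset())}
    for part in parts:
        # Keep only concepts that actually contain documents.
        merged |= {concept for concept in part if concept[0]}
    return merged

# Hypothetical toy collection: document id -> set of stemmed terms.
docs = {
    "d1": {"india", "pakistan", "kashmir"},
    "d2": {"india", "pakistan", "nehru"},
    "d3": {"oil", "price"},
    "d4": {"oil", "price", "opec"},
}
clusters = pre_cluster(docs)
parts = [local_concepts(docs, cluster) for cluster in clusters]
ontology = merge(parts)
print(clusters)
print(len(ontology))
```

The sketch makes the time argument concrete: the brute-force FCA step enumerates 2^n subsets of an n-document cluster, so running it on several small clusters is far cheaper than running it once on the whole collection.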