A Self-organizing Semantic Map for Information Retrieval Xia Lin, Dagobert Soergel, Gary Marchionini presented by Yi-Ting.

Outline Introduction Kohonen’s feature map A Self-Organizing Semantic Map for AI Literature A prototype system Future research

Introduction Discoveries of new learning algorithms in artificial neural networks. An application of an unsupervised learning method, Kohonen's feature map algorithm. Revisits Doyle's classic article "Semantic road maps for literature searchers" (1961). Three most distinguishing characteristics of Kohonen's feature map: it reflects the frequencies and distributions of the underlying input data; it changes the understanding of the computer's role; it performs a projection.

Kohonen's feature map An unsupervised learning method. A self-organizing learning algorithm that presumably produces feature maps similar to those occurring in the brain. Each input object is represented by an N-dimensional vector, called a feature vector. It maps input objects onto a two-dimensional grid.

Kohonen's feature map

Each node in the grid is assigned an N-dimensional vector. Learning process: select an input vector randomly; find the node whose weights are closest to the input vector; adjust the weights of the winning node; adjust the weights of the nodes close to the winning node.
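The four-step learning process above can be sketched as a minimal training loop. This is an illustrative NumPy sketch, not the paper's implementation; the random initialization, linear gain schedule, and shrinking neighborhood radius are assumptions.

```python
import numpy as np

def train_som(inputs, grid_rows, grid_cols, n_iters=1000, alpha0=0.5, seed=0):
    """Minimal self-organizing map training loop (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n_features = inputs.shape[1]
    # Each grid node is assigned an N-dimensional weight vector, here random.
    weights = rng.random((grid_rows * grid_cols, n_features))
    # Grid coordinates of each node, used to find grid neighbours.
    coords = np.array([(r, c) for r in range(grid_rows) for c in range(grid_cols)])
    for t in range(n_iters):
        # 1. Select an input vector at random.
        x = inputs[rng.integers(len(inputs))]
        # 2. Find the node whose weights are closest to the input (the winner).
        winner = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
        # 3./4. Adjust the winner and nodes close to it on the grid; both the
        # gain and the neighbourhood radius shrink as training proceeds.
        alpha = alpha0 * (1 - t / n_iters)
        radius = max(1.0, (max(grid_rows, grid_cols) / 2) * (1 - t / n_iters))
        grid_dist = np.linalg.norm(coords - coords[winner], axis=1)
        mask = grid_dist <= radius
        weights[mask] += alpha * (x - weights[mask])
    return weights
```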

Kohonen's feature map A projection of the input space onto the two-dimensional grid. Two properties of such a feature map: the feature map preserves the distance relationships among the input data as faithfully as possible; the feature map allocates different numbers of nodes to inputs based on their occurrence frequencies.

Kohonen's feature map Two basic procedures: selecting a winning node and updating weights. The winning node is selected based on the Euclidean distance. Let X(t) = (x_1(t), x_2(t), ..., x_N(t)) be the input vector selected at time t and W_k(t) = (w_k1(t), ..., w_kN(t)) the weight vector of node k at time t. The winning node s is selected so that ||X(t) - W_s(t)|| = min_k ||X(t) - W_k(t)||.
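The winner-selection rule above is a one-line minimization over Euclidean distances; a small sketch, with `weights` assumed to be a (num_nodes, N) array:

```python
import numpy as np

def winning_node(x, weights):
    """Select s such that ||x - W_s|| = min_k ||x - W_k|| (Euclidean distance).

    x       : input vector of length N
    weights : (num_nodes, N) array, one weight vector per grid node
    """
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))
```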

Kohonen's feature map The weights of s and the weights of the nodes in a defined neighborhood are adjusted by W_k(t+1) = W_k(t) + α(t)[X(t) - W_k(t)], where α(t) is a gain term (0 < α(t) < 1) that decreases in time and converges to 0. Two control mechanisms are imposed: the neighborhood of a node shrinks gradually over time, and the gain parameter is adaptive.
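The update rule W_k(t+1) = W_k(t) + α(t)[X(t) - W_k(t)] moves each neighborhood node's weights a fraction α toward the input; a minimal sketch of one update step:

```python
import numpy as np

def update_weights(weights, x, neighborhood, alpha):
    """One step of W_k(t+1) = W_k(t) + alpha * (x - W_k(t)).

    weights      : (num_nodes, N) array of node weight vectors
    x            : the input vector X(t)
    neighborhood : indices of the winner and its neighbours
    alpha        : gain term, 0 < alpha < 1
    """
    weights = weights.copy()
    weights[neighborhood] += alpha * (x - weights[neighborhood])
    return weights
```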

Kohonen's feature map The gain parameter: Kohonen initially defined it over geographic neighborhoods. A more recent version adopts the Gaussian function. The Gaussian function is supposed to describe a more natural mapping and thus help the algorithm converge in a more stable manner.
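A Gaussian-shaped gain makes the update strength fall off smoothly with grid distance from the winner instead of cutting off at a fixed neighborhood boundary. A sketch, assuming exponential decay schedules for both the peak gain and the width (the constants are illustrative):

```python
import numpy as np

def gaussian_gain(winner_pos, node_pos, t, alpha0=0.5, sigma0=2.0, tau=1000.0):
    """Gaussian gain: nodes nearer the winner on the grid receive a larger
    update, and both the peak gain and the width decay over time t."""
    alpha = alpha0 * np.exp(-t / tau)   # overall gain decreases toward 0
    sigma = sigma0 * np.exp(-t / tau)   # neighbourhood width shrinks
    d2 = np.sum((np.asarray(winner_pos) - np.asarray(node_pos)) ** 2)
    return alpha * np.exp(-d2 / (2 * sigma ** 2))
```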

A Self-Organizing Semantic Map for AI Literature Kohonen's self-organizing map algorithm is applied to a collection of documents from the LISA database: all 140 titles indexed by the descriptor "Artificial Intelligence", with 25 words retained. A document vector contains 1 in a given column if the corresponding word occurs in the document title and 0 otherwise. The document vectors are used as input to train a feature map of 25 features and 140 nodes.
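Building such binary document vectors is straightforward; a sketch with a hypothetical five-word vocabulary standing in for the 25 retained words:

```python
# Hypothetical vocabulary, standing in for the 25 words retained from the
# 140 LISA titles described above.
vocabulary = ["expert", "system", "knowledge", "retrieval", "robot"]

def document_vector(title, vocabulary):
    """1 in a column if the corresponding word occurs in the title, else 0."""
    words = set(title.lower().split())
    return [1 if w in words else 0 for w in vocabulary]
```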

A Self-Organizing Semantic Map for AI Literature

Fig. 3 is the semantic map of documents obtained after 2,500 training cycles. The map contains very rich information: the numbers collectively reveal the distribution of documents on the map, and the map is divided into concept areas.

A Self-Organizing Semantic Map for AI Literature The area to which a node belongs is determined as follows: Way 1 - compare the node to every unit vector and label the node with the closest unit vector. Way 2 - compare each unit vector to every node and label the winning node with the word corresponding to the unit vector. When two words fall into the same area, the areas are merged.
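Way 2 above can be sketched as follows: each word's unit vector (1 in that word's dimension, 0 elsewhere) is matched against every node's weights, and the closest node gets the word as its label. An illustrative sketch, not the paper's exact procedure:

```python
import numpy as np

def label_nodes(weights, words):
    """Label nodes by comparing each unit vector to every node (Way 2).

    weights : (num_nodes, N) array of trained node weight vectors
    words   : the N words, one per input dimension
    Returns {node_index: [words whose unit vector won at that node]}.
    """
    labels = {}
    for i, word in enumerate(words):
        unit = np.zeros(weights.shape[1])
        unit[i] = 1.0  # unit vector for this word's dimension
        winner = int(np.argmin(np.linalg.norm(weights - unit, axis=1)))
        labels.setdefault(winner, []).append(word)
    return labels
```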

A Self-Organizing Semantic Map for AI Literature The size of the areas corresponds to the frequencies of occurrence of the words. The relationship between frequencies of word co-occurrence and neighboring areas is also apparent. Other properties of the map are particularly related to the nature of the document space: the final weight patterns could be a good associative indexing scheme for the collection, and the areas relate to descriptors in a thesaurus.

A Self-Organizing Semantic Map for AI Literature Comparing the self-organizing semantic map with Doyle's semantic road map: Doyle's map was basically a demonstration rather than a practical implementation; the self-organizing map reveals the frequencies and distributions of the underlying data; the self-organizing map allows much more flexibility.

A prototype system A prototype system in HyperCard has been designed using a semantic map as an interface. The system is designed to accept downloaded bibliographical data and automatically add links among the data for easy browsing.

The first card of the system contains a self-organizing semantic map

These selections will cause the system to display titles associated with the selected nodes

To see the full record

A prototype system The design goal is to conceptualize an information retrieval approach that uses traditional search techniques as information filters and the semantic map as a browsing aid: to enhance post-processing of the retrieved set, and to provide the user with a simple, high-recall-oriented search environment.

Future research Several promising features: it maps a high-dimensional document space onto a two-dimensional map while maintaining document inter-relationships as faithfully as possible; it visualizes the underlying structure of a document space; it results in an indexing scheme that shows the potential to compensate for the incompleteness of a simple document representation.

Future research Dimension reduction: latent semantic indexing (LSI). What are the dimensions for the document space? Automated generation of dimensions from word-stem frequencies as they appear in titles. The document space can be defined using assigned descriptors as dimensions (descriptor assignment). The extended vector model can also be used to define a document space.
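The core operation of the LSI approach mentioned above is a truncated singular value decomposition of the term-document matrix; a minimal sketch (real LSI systems also apply term weighting, which is omitted here):

```python
import numpy as np

def lsi_reduce(term_doc, k):
    """Reduce a term-document matrix to k latent dimensions via truncated SVD.

    term_doc : (num_terms, num_docs) matrix
    Returns a (num_docs, k) array: one k-dimensional row per document.
    """
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k largest singular values/vectors.
    return (np.diag(s[:k]) @ vt[:k]).T
```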

Future research Information visualization: Crouch (1986) provided an overview of the graphical techniques used for visual display in IR: clustering, multidimensional scaling, iconic representation, and his own component-scale drawing. Most of these techniques were not appropriate for an interactive environment. The self-organizing semantic map can be considered another technique for the visual display of information in IR.

Future research The human-generated semantic map: the properties inherited from Kohonen's feature map algorithm are promising for the purposes of IR, but it is not yet clear what properties we may obtain if we apply other learning algorithms to produce the semantic map. It is desirable to have a semantic map that is independent of any particular algorithm.

Future research A further comparison of the maps generated by the algorithm and by people is to use the maps in a retrieval setting. Through a series of such investigations, the authors expect to better understand the construction and the properties of the self-organizing semantic map.