Conceptual Spaces
P.D. Bruza
Information Ecology Project
Distributed Systems Technology Centre
Part 1: Fundamental notions

Opening remarks
This tutorial is more about cognitive science than IR; it is fragmented and offers a somewhat personal interpretation
The content is drawn mostly from Gärdenfors, Conceptual Spaces: The Geometry of Thought, MIT Press
It is also driven by some personal intuition:
– The model theory for IR should be rooted in cognitive semantics
– How do you capture these cognitive semantics in a computational form, and what can you do with them?

Gärdenfors' point of departure
How can representations (information) in a cognitive system be modelled in an appropriate way?
– Symbolic perspective: representation via symbols; a cognitive system is described by a Turing machine (cognition = computation = symbol manipulation)
– Associationist perspective: representation via associations between different kinds of information elements (e.g. connectionism, where associations are modelled by artificial neural networks)

The problem with the symbolic and associationist perspectives
Mechanisms of concept acquisition, which are paramount for the understanding of many cognitive phenomena, cannot be given a satisfactory treatment in any of these representational forms
– Concept acquisition (learning) is closely tied to similarity
– Geometric representation: similarity can be modelled in a natural way

Gärdenfors' cognitive model
– Symbolic level: propositional representation
– Conceptual level: geometric representation
– Associationist (sub-conceptual) level: connectionist representation

Conceptual spaces outline
– Quality dimension
– Domain
– Property, concept
Conceptual spaces are a framework for a number of empirical theories: concept formation, induction, semantics
(Context)
How can conceptual spaces be realized (e.g., for IR)?

Quality dimensions
Represent various qualities of an object:
– Temperature
– Weight
– Brightness
– Pitch
– Height
– Width
– Depth
A distinction is made between scientific and phenomenal (psychological) dimensions

Quality dimensions (cont)
Each quality dimension is endowed with certain geometrical structures (in some cases topological or ordering relations)
Weight: isomorphic to the non-negative reals (a half-line starting at 0)

Quality dimensions may have a discrete geometric structure
A discrete structure divides objects into disjoint classes
Kinship relation: father, mother, sister, etc. (geometric structure = discrete points)
Even for discrete dimensions we can distinguish a rudimentary geometric structure

Phenomenal vs. scientific interpretations of dimensions
Phenomenal interpretation: dimensions originate from cognitive structures (perception, memories) of humans or other organisms
– E.g., (height, width, depth), hue, pitch
Scientific interpretation: dimensions are treated as part of a scientific theory
– E.g., weight

Example: colour
Hue: the particular shade of colour
– Geometric structure: circle
– Value: polar coordinate
Chromaticity: the saturation of the colour, from grey to higher intensities
– Geometric structure: segment of reals
– Value: real number
Brightness: black to white
– Geometric structure: reals in [0,1]
– Value: real number

Example: colour (hue, chromaticity, brightness)
N.B. The geometric structure allows phenomenologically complementary and opposite hues to be distinguished
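The circular structure of hue can be made concrete with a small sketch. It assumes hue is stored as an angle in degrees; the function names and the choice of 0 degrees for red are illustrative assumptions, not part of the slides.

```python
def hue_distance(h1: float, h2: float) -> float:
    """Shortest angular distance (degrees) between two hues on the hue circle."""
    diff = abs(h1 - h2) % 360.0
    return min(diff, 360.0 - diff)


def opposite_hue(h: float) -> float:
    """Hue lying diametrically opposite h on the circle."""
    return (h + 180.0) % 360.0


# A colour as a point in the colour domain: (hue, chromaticity, brightness).
# Placing red at 0 degrees is an assumption made for this example.
red = (0.0, 0.9, 0.5)

print(hue_distance(350.0, 10.0))  # 20.0: the circle wraps around
print(opposite_hue(red[0]))       # 180.0
```

On a linear scale 350 and 10 would be 340 apart; only the circular structure makes them neighbours and gives every hue a well-defined opposite.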

Integral and separable dimensions
Dimensions are integral if an object cannot be assigned a value in one dimension without giving it a value in another:
– E.g. one cannot distinguish hue without brightness, or pitch without loudness
Dimensions that are not integral are said to be separable
Psychologically, integral and separable dimensions are assumed to differ in cross-dimensional similarity:
– Integral dimensions are higher in cross-dimensional similarity than separable dimensions
– (This point will motivate how similarities in the conceptual space are calculated depending on whether dimensions are integral or separable. N.B. IR matching functions treat all dimensions equally)

Where do dimensions originate from?
Scientific dimensions: tightly connected to the measurement methods used
Psychological dimensions:
– Some dimensions appear innate, or are developed very early, e.g. inside/outside, dangerous/not-dangerous. (These appear to be pre-conscious)
– Dimensions are necessary for learning: to make sense of the "blooming, buzzing confusion"
Dimensions are added by the learning process to expand the conceptual space:
– E.g., young children have difficulty in identifying whether two objects differ w.r.t. brightness or size, even though they can see that the objects differ in some way
Both differentiation and dimensionalization occur throughout one's lifetime

In summary
Quality dimensions are the building blocks of representations within a conceptual space
Gärdenfors' rebuttal of logical positivism:
– Humans and other animals can represent the qualities of objects, for example, when planning an action, without presuming an internal language or another symbolic system in which these qualities are expressed. As a consequence, I claim that the quality dimensions of conceptual spaces are independent of symbolic representations and more fundamental than these

Domains and conceptual space
A domain is a set of integral dimensions that forms a separable subspace (e.g., hue, chromaticity, brightness)
A conceptual space is a collection of one or more domains
– Cognitive structure is defined in terms of domains, as it is assumed that an object can be ascribed certain properties independently of other properties
Not all domains are assumed to be metric: a domain may be an ordering with no distance defined
Domains are not necessarily independent; they may be correlated, e.g., the ripeness and colour domains co-vary in the space of fruits

Properties and concepts: general idea
A property is a region in a subspace (domain)
A concept is based on several separable subspaces

Example property: red
[Figure: the region for red in the colour domain; axes: hue, chromaticity, brightness]
Criterion P: A natural property is a convex region of a domain (subspace)
Natural: those properties that are natural for the purposes of problem solving, planning, communicating, etc.

Motivation for convex regions
[Figure: two regions, each with points x and y marked; one labelled Convex, the other Not convex]
x and y are points (objects) in the conceptual space
If x and y both have property P, then any object between x and y is assumed to have property P
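The convexity condition can be probed numerically. The sketch below assumes a two-dimensional domain with Euclidean betweenness (points on the straight segment from x to y) and samples that segment; the region predicates `disc` and `ring` are hypothetical examples, and sampling can only find counterexamples, not prove convexity.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def segment_stays_inside(region: Callable[[Point], bool],
                         x: Point, y: Point, steps: int = 100) -> bool:
    """Sample the straight segment from x to y; True if every sample is in the region."""
    for i in range(steps + 1):
        t = i / steps
        p = (x[0] + t * (y[0] - x[0]), x[1] + t * (y[1] - x[1]))
        if not region(p):
            return False
    return True


def disc(p: Point) -> bool:
    """Unit disc: a convex region."""
    return p[0] ** 2 + p[1] ** 2 <= 1.0


def ring(p: Point) -> bool:
    """Unit disc with a hole in the middle: not convex."""
    return 0.25 <= p[0] ** 2 + p[1] ** 2 <= 1.0


x, y = (-0.8, 0.0), (0.8, 0.0)
print(segment_stays_inside(disc, x, y))  # True
print(segment_stays_inside(ring, x, y))  # False: the segment crosses the hole
```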

Remarks about Criterion P
Criterion P: A natural property is a convex region of a domain (subspace)
Assumption: Most properties expressed by simple words in natural languages can be analyzed as natural properties
The semantics of the linguistic constituents (e.g. red) is severely constrained by the underlying conceptual space (i.e. no "bleen")
Criterion P provides an account of properties that is independent of both possible worlds and objects
Strong connection between convex regions and prototype theory (categorization)
(Easier to understand how inductive inferences are made)

Example concept: apple
Apple = [table not preserved in the transcript]
Criterion C: A natural concept is represented as a set of regions in a number of domains together with an assignment of salience weights to the domains and information about how the regions in the different domains are correlated
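Criterion C suggests a simple data structure: regions per domain, salience weights per domain, and cross-domain correlations. The sketch below is one possible encoding under strong simplifying assumptions: regions are axis-aligned boxes (which are convex), and every domain name, interval and weight for "apple" is a made-up illustration, not the content of the missing slide table.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Interval = Tuple[float, float]
Region = Dict[str, Interval]  # dimension name -> (low, high); a box is convex


@dataclass
class Concept:
    name: str
    regions: Dict[str, Region]    # domain name -> region in that domain
    salience: Dict[str, float]    # domain name -> salience weight
    correlations: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def in_region(self, domain: str, point: Dict[str, float]) -> bool:
        """True if the point lies inside this concept's region for the domain."""
        box = self.regions[domain]
        return all(lo <= point[dim] <= hi for dim, (lo, hi) in box.items())


# All values below are hypothetical and chosen only for illustration.
apple = Concept(
    name="apple",
    regions={
        "colour": {"hue": (0.0, 120.0), "chromaticity": (0.4, 1.0), "brightness": (0.3, 0.9)},
        "taste": {"sweetness": (0.4, 1.0), "sourness": (0.1, 0.7)},
    },
    salience={"colour": 0.6, "taste": 0.4},
    correlations={("colour", "taste"): 0.5},
)

print(apple.in_region("colour", {"hue": 20.0, "chromaticity": 0.8, "brightness": 0.5}))   # True
print(apple.in_region("colour", {"hue": 240.0, "chromaticity": 0.8, "brightness": 0.5}))  # False
```

Treating hue as a plain interval ignores its circular structure; a fuller model would use an arc on the hue circle.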

Concepts and inference (in passing)
The salience of different domains determines which associations can be made, and which inferences can be triggered
– Context: moving a piano leads to the association "heavy"
More about this next time…

How to model relevance: concept?
Topicality        About my topic
Novelty           Unique or the only source; familiar
Currency          Up-to-date
Quality           Well written, credible
Presentation      Comprehensive
Source aspects    Prominent author
Info aspects      Theoretical paper
Appeal            Enjoyable
Table from Yuan, Belkin and Kim, ACM SIGIR 2002 Poster

How to model a document(s)?
An exosomatic memory is a computerized system that operates as an extension to human memory. Ideally, use of an exosomatic system would be transparent, so that finding information would seem the same as remembering it to the human user (B.C. Brookes, 1975)
– To create computerized representations of data sets that are consistent with human perception of the data sets
– To enable personalized relations to representations of data sets
– To provide natural interfaces for interaction with exosomatic memory
Newby, G. Cognitive space and information space. JASIST 52(12), 2001

Term = dimension
Since many of the fundamental quality dimensions are determined by our perceptual mechanisms, there is a direct link between properties described by regions of such dimensions and perceptions (rats!)
However, dimensional spaces based on terms have shown marked correlation with human information processing:
– HAL, and note: it is difficult to know how to encode abstract concepts with traditional semantic features. Global co-occurrence models, such as HAL, may provide a solution to part of this problem
– So, terms as dimensions in a global co-occurrence model lead to useful vector representations of abstract concepts
– HAL's results seem to be echoed by Newby, using Principal Component Analysis on a term-term co-occurrence matrix
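To illustrate how terms can act as dimensions, here is a minimal sketch of a HAL-style co-occurrence matrix. It is a simplification under stated assumptions, not the original HAL implementation: a sliding window looks only at preceding words, nearer words receive larger weights, and the function name, window size and toy sentence are all illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def cooccurrence(tokens: List[str], window: int = 5) -> Dict[str, Dict[str, float]]:
    """Weighted counts of the words preceding each word within the window.

    A preceding word at distance k (1 = adjacent) contributes window - k + 1.
    """
    matrix = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break
            matrix[word][tokens[j]] += window - distance + 1
    return {w: dict(row) for w, row in matrix.items()}


tokens = "the horse raced past the barn".split()
m = cooccurrence(tokens, window=2)
print(m["raced"])  # {'horse': 2.0, 'the': 1.0}
```

Each row of the matrix is a vector for one word, with every other term serving as a dimension.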

Text fragment = dimension
For example, a (term x document) matrix
Latent semantic analysis produces vector representations of words in a reduced dimensional space:
– LSA correlates with human information processing on a number of tasks, e.g., semantic priming
– Landauer et al. often use short fragments (dimension = 1 or 2 sentences)
Dimensional reduction is apparently successful in reproducing cognitive compatibility, but the reason for this is unknown
Determining the appropriate dimensional structure for IR models is still an open question, especially in light of cognitive aspects

Similarity: introductory remarks
Similarity is central to many aspects of cognition: concept formation (learning), memory and perceptual organization
Similarity is not an absolute notion but relative to a particular domain (or dimension)
– An apple and an orange are similar as they have the same shape
– Similarity defined in terms of the number of shared properties leads to arbitrary similarity: a writing desk is like a raven
Similarity is an exponentially decreasing function of distance
N.B. clustering in IR often uses an absolute notion of similarity

Metric spaces
A real-valued function d(x,y) is said to be a distance function for a space S if it satisfies the following conditions for all points x, y and z in S:
– $d(x,y) \geq 0$, and $d(x,y) = 0$ only if $x = y$ (minimality)
– $d(x,y) = d(y,x)$ (symmetry)
– $d(x,y) + d(y,z) \geq d(x,z)$ (triangle inequality)
A space that has a distance function is called a metric space
(There is debate about whether distance is symmetric from a psychological viewpoint. E.g., Tversky et al.: Tel Aviv is judged more similar to New York than vice versa. Gärdenfors accepts the symmetry axiom)
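The standard conditions on a distance function (non-negativity with zero distance only between identical points, symmetry, and the triangle inequality) can be spot-checked on a finite sample of points. The helper below is a hypothetical illustration; passing the check on a sample does not prove that a function is a metric, but a single failure shows that it is not.

```python
from itertools import product


def metric_violations(d, points, tol=1e-9):
    """Return the names of the metric conditions that d breaks on the sample points."""
    broken = set()
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol or ((d(x, y) <= tol) != (x == y)):
            broken.add("minimality")
        if abs(d(x, y) - d(y, x)) > tol:
            broken.add("symmetry")
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            broken.add("triangle inequality")
    return broken


def city_block(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
print(metric_violations(city_block, sample))         # set()
print(metric_violations(squared_euclidean, sample))  # {'triangle inequality'}
```

Squared Euclidean distance fails because going from (0,0) to (2,0) directly costs 4, while the detour via (1,0) costs only 1 + 1.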

Equi-distance under the Euclidean metric
The set of points at distance d from a point x forms a circle
Points between x and y are on a straight line

Equi-distance under the city-block metric
The set of points at distance d from a point x forms a diamond
The set of points between x and y is a rectangle generated by x and y and the directions of the axes

Between-ness in the city-block metric
[Figure: rectangle generated by x and y]
All points in the rectangle are considered to be between x and y
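Betweenness can be defined from the metric alone: b is between x and y when d(x,b) + d(b,y) = d(x,y). The following sketch, with illustrative function names and points, shows that a corner of the rectangle counts as between x and y under the city-block metric but not under the Euclidean metric.

```python
import math


def city_block(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def is_between(d, x, b, y, tol=1e-9):
    """b is between x and y under metric d if the detour via b adds no distance."""
    return abs(d(x, b) + d(b, y) - d(x, y)) <= tol


x, y = (0.0, 0.0), (2.0, 1.0)
corner = (2.0, 0.0)    # a corner of the rectangle spanned by x and y
midpoint = (1.0, 0.5)  # lies on the straight line from x to y

print(is_between(city_block, x, corner, y))   # True
print(is_between(euclidean, x, corner, y))    # False
print(is_between(euclidean, x, midpoint, y))  # True
```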

Metrics: integral and separable dimensions
For separable dimensions, calculate the distance using the city-block metric:
– If two dimensions are separable, the dissimilarity of two stimuli is obtained by adding the dissimilarity along each of the two dimensions
For integral dimensions, calculate the distance using the Euclidean metric:
– When two dimensions are integral, the dissimilarity is determined by both dimensions taken together
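One way to combine the two rules, offered here as an assumption rather than as the method of the slides, is to use the Euclidean metric inside each domain (whose dimensions are integral) and then add the per-domain distances, city-block style, across domains (which are separable). The domain names and coordinates are made up for the example.

```python
import math
from typing import Dict, Sequence


def combined_distance(x: Dict[str, Sequence[float]],
                      y: Dict[str, Sequence[float]]) -> float:
    """Euclidean distance within each domain, summed across domains."""
    total = 0.0
    for domain in x:
        within = math.sqrt(sum((a - b) ** 2 for a, b in zip(x[domain], y[domain])))
        total += within  # separable domains: dissimilarities add up
    return total


# Hypothetical objects described in two domains.
a = {"colour": (0.0, 0.0, 0.0), "size": (1.0,)}
b = {"colour": (3.0, 4.0, 0.0), "size": (3.0,)}

print(combined_distance(a, b))  # 7.0: 5.0 within colour plus 2.0 within size
```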

Minkowski metrics
Euclidean and city-block are special cases of Minkowski metrics:
$d_r(x,y) = \left( \sum_{i=1}^{n} |x_i - y_i|^r \right)^{1/r}$
– City-block: r = 1
– Euclidean: r = 2
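A minimal sketch of the Minkowski family, using the standard form in which coordinate differences are raised to the power r, summed, and the r-th root is taken. The function name is illustrative.

```python
from typing import Sequence


def minkowski(x: Sequence[float], y: Sequence[float], r: float) -> float:
    """Minkowski distance of order r (r = 1: city-block, r = 2: Euclidean)."""
    if r < 1:
        raise ValueError("r must be at least 1 for the result to be a metric")
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, 1))  # 7.0 (city-block)
print(minkowski(x, y, 2))  # 5.0 (Euclidean)
```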

Scaling dimensions
Due to context, the scales of the different dimensions cannot be assumed identical
$d_r(x,y) = \left( \sum_{i=1}^{n} w_i \, |x_i - y_i|^r \right)^{1/r}$, where $w_i$ is the dimensional scaling factor
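The effect of context-dependent scaling can be sketched by placing a weight on each dimension inside the sum. The weighted form and all values below are assumptions made for illustration; the point is only that changing the weights can change which object counts as nearest.

```python
from typing import Sequence


def weighted_distance(x: Sequence[float], y: Sequence[float],
                      weights: Sequence[float], r: float = 1.0) -> float:
    """Minkowski-style distance with a scaling factor per dimension."""
    return sum(w * abs(a - b) ** r for w, a, b in zip(weights, x, y)) ** (1.0 / r)


target = (0.0, 0.0)
a = (1.0, 0.0)  # differs from the target only on dimension 0
b = (0.0, 2.0)  # differs from the target only on dimension 1

neutral = (1.0, 1.0)
context = (4.0, 1.0)  # a context in which dimension 0 matters more

print(weighted_distance(target, a, neutral), weighted_distance(target, b, neutral))  # 1.0 2.0
print(weighted_distance(target, a, context), weighted_distance(target, b, context))  # 4.0 2.0
```

Under neutral weights a is the nearer object; once dimension 0 is stretched, b becomes nearer.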

Similarity as a function of distance
A common assumption in psychological literature is that similarity is an exponentially decaying function of distance:
$s(x,y) = e^{-c \cdot d(x,y)}$
The constant c is a sensitivity parameter. The similarity between x and y drops quickly when the distance between the objects is relatively small, while it drops more slowly when the distance is relatively large.
The formula captures the similarity-based generalization performances of human subjects in a variety of settings
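A short sketch of exponentially decaying similarity, with the sensitivity parameter c defaulting to 1 as an arbitrary choice. It shows that a unit step in distance costs much more similarity near zero than further out.

```python
import math


def similarity(distance: float, c: float = 1.0) -> float:
    """Similarity as an exponentially decaying function of distance."""
    return math.exp(-c * distance)


for d in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(d, round(similarity(d), 3))

# Drop in similarity over one unit of distance, near and far from zero.
near_drop = similarity(0.0) - similarity(1.0)
far_drop = similarity(3.0) - similarity(4.0)
print(near_drop > far_drop)  # True
```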

IR-related comments on similarity
In the vector-space model, similarity is determined by the cosine function, which is not exponentially decaying
IR models don't distinguish between integral and separable dimensions, even though this distinction is significant from a cognitive point of view
Experience so far with computational cognitive models is mixed:
– LSA uses cosine similarity (not exponentially decaying)!
– HAL used Minkowski (r = 1) to measure semantic distance, i.e. a non-Euclidean distance metric was employed
– (Non-Euclidean metrics should perhaps be explored)
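The contrast between cosine similarity and distance-based similarity can be seen on two vectors that point in the same direction but lie far apart. This is an illustrative sketch with made-up vectors; the exponential form uses Euclidean distance and c = 1 as assumptions.

```python
import math
from typing import Sequence


def cosine(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between x and y; ignores vector length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def exp_similarity(x: Sequence[float], y: Sequence[float], c: float = 1.0) -> float:
    """Exponentially decaying similarity over Euclidean distance."""
    return math.exp(-c * math.dist(x, y))


x, y = (1.0, 1.0), (10.0, 10.0)
print(cosine(x, y))          # approximately 1: same direction
print(exp_similarity(x, y))  # close to 0: the points are far apart
```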

Prototypes and categorical perception: introductory remarks
Human subjects judge a robin as a more prototypical bird than a penguin
Classifying an object is accomplished by determining its similarity to the prototype:
– Similarity is judged w.r.t. a reference object/region
– Similarity is context-sensitive: a robin is a prototypical bird, but a canary is a prototypical pet bird
Continuous perception: membership of a category is graded

Prototype regions in animal space
[Figure: regions for reptile, mammal and bird, with bat, platypus, penguin, robin, emu and archaeopteryx marked; based on Gärdenfors & Williams, IJCAI 2001]
Categorical perception: stimuli between categories are distinguished with more ease and accuracy than stimuli within them

Computing categories in conceptual space: Voronoi tessellations
Given prototypes p_1, ..., p_n, require that a point q be in the same category as its most similar prototype.
Consequence: partitioning of the space into convex regions
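Assigning a point to its most similar prototype amounts to a nearest-prototype rule, which implicitly carves the space into Voronoi cells. The sketch below assumes the Euclidean metric and uses invented two-dimensional coordinates for the prototypes; neither comes from the slides.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def categorize(q: Point, prototypes: Dict[str, Point]) -> str:
    """Return the category whose prototype is nearest to q (Euclidean metric)."""
    return min(prototypes, key=lambda name: math.dist(q, prototypes[name]))


# Hypothetical prototype positions in a two-dimensional animal space.
prototypes = {
    "bird": (0.0, 1.0),
    "mammal": (1.0, 0.0),
    "reptile": (-1.0, 0.0),
}

print(categorize((0.2, 0.7), prototypes))   # bird
print(categorize((0.9, 0.2), prototypes))   # mammal
print(categorize((-0.6, 0.1), prototypes))  # reptile
```

With the Euclidean metric each resulting cell is an intersection of half-planes and is therefore convex, in line with Criterion P.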

Voronoi tessellations (cont)
Much psychological data concords with tessellating conceptual spaces into star-shaped (and sometimes convex) regions around prototypes (e.g., stop consonants in phoneme classification)
Boundaries produced by Voronoi tessellations provide the threshold of similarity and support a mechanism explaining categorical perception
Gärdenfors & Williams, Reasoning about categories in conceptual spaces, Proceedings IJCAI 2001

Part II
– Concept combination
– Induction
– Semantics
– Non-monotonic aspects of concepts
– Realizing (approximating) conceptual spaces