A Cognitive Framework for Exploiting Context in Information Retrieval

1 A Cognitive Framework for Exploiting Context in Information Retrieval
Birger Larsen
Information Interaction and Information Architecture
Royal School of Library and Information Science, Copenhagen, Denmark
IR Seminar, University of Glasgow, January 25, 2010

2 Outline
The idea of polyrepresentation in Information Retrieval
  cognitive representations associated with users, documents and IR models
Empirical evidence published to date
  Similar approaches – not adhering directly to polyrepresentation
  Results of experiments from polyrepresentative perspective
    On Information Space
      Combination of databases
      Combinations of search engines
      Combinations of document representations
    On Cognitive Space
      Work task perception and knowledge state inclusion
Future work
DCS, University of Glasgow Cognitive Framework for Exploiting Context

3 Polyrepresentation
First presented in 1994 / 1996
Originates in Peter Ingwersen’s work on establishing a theory for interactive IR from a cognitive point of view (1992)
May be seen as an effort to demonstrate the applicability of this cognitive viewpoint
  not a formal mathematical theory, but rather a holistic framework
  emphasises the potential benefits of exploiting combinations of representations based on their cognitive origins

4 Polyrepresentation
Central hypothesis: The more cognitively or typologically different representations (evidence; features) that point to an information object – and the more intensively they do so – the higher the probability that the object is relevant to the topic, the information need, the situation at hand, or the influencing context of the situation (The Turn, 2005, p. 208)
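The hypothesis can be read as a ranking rule. The following is only a minimal sketch under stated assumptions, not the formulation in The Turn: the function name, the evidence triples, and the tie-breaking choice (number of distinct representations first, summed intensity second) are all invented for illustration.

```python
from collections import defaultdict


def rank_by_polyrepresentation(evidence):
    """Rank documents by how many distinct representations point to them.

    evidence: iterable of (doc_id, representation, intensity) triples.
    Assumption (not from the slides): ties on the number of distinct
    representations are broken by the summed intensity.
    """
    per_doc = defaultdict(dict)
    for doc_id, representation, intensity in evidence:
        # Keep only the strongest signal seen for each representation type.
        previous = per_doc[doc_id].get(representation, 0.0)
        per_doc[doc_id][representation] = max(previous, intensity)

    def key(doc_id):
        signals = per_doc[doc_id]
        return (len(signals), sum(signals.values()))

    return sorted(per_doc, key=key, reverse=True)


# Invented toy evidence: representation names and intensities are hypothetical.
evidence = [
    ("d1", "title", 1.0), ("d1", "descriptor", 0.5), ("d1", "citation", 0.5),
    ("d2", "title", 1.0),
    ("d3", "title", 0.5), ("d3", "descriptor", 0.5),
]
ranking = rank_by_polyrepresentation(evidence)
```

In this toy data d3 outranks d2 even though both have the same summed intensity, because two different representations point to d3 and only one to d2.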

5 Polyrepresentation
Why use polyrepresentation at all today?
It's all about context … and how to exploit different contexts
It is integration … and might serve as a common framework for integrating various facets of IR and interaction
It is oriented towards practical application
Relatively few studies have so far directly implemented polyrepresentation
First presented as a ‘theory’, later as a ‘principle’ ...

6 Representations?
A plethora of different preconditions and interpretations of the current situation:
  from different cognitive origins – cognitively different
  from the same origin, but displaying functionally different cognitive types, e.g. TI, AB, full-text sections, table captions etc. from one author
Performed in different styles depending on domain
  For instance, academic papers vs. blog entries vs. radio news broadcasts

7 Users seeking information
[Diagram labels: Documents; Users seeking information; IR models & systems]

8 Features of Author’s responsibility
Interpretation by author(s)
  Full-text terms – Zipfian distributions
  Particular section terms (e.g. Introduction – XML structures)
  Title & section title terms
  Caption terms …
  Image features
Situational/domain interpretation by author(s)
  References & anchor texts (with cited names, journals, titles ...)
  Out-links – with anchor text

9 Isness: externally assigned object features
Socio-cognitive interpretations & pertinence assessments – represented by:
  Journal name – conference … (peer reviews)
  Publication year / date (up to editors)
  Database name(s) (inclusion in bibliographical databases)
  Corporate source & geo-location (employer)
  Citations, citedness – inlinks (recognition ...): titles – authors of objects – journal …
  Web entities …

10 Polyrepresentative overlaps of cognitively & typologically different representations
By one engine in information space, associated with one searcher statement, in scholarly documents
[Diagram: COGNITIVE OVERLAP of the following representation groups]
  AUTHOR(s): text, images, headings, captions, titles, references, out-links
  CITATIONS: in-links to titles, authors & passages
  SELECTORS: journal name, publication year, database(s), corporate source, country
  INDEXERS: class codes, descriptors, document type, weights
  THESAURUS structure

11 Traditional Boolean IR vs. Cognitive IR
Traditional Boolean IR: Key union (in basic index) AND Journal Name (JN=)
Cognitive overlaps [diagram labels]: Key A/TI,DE; JN=nnnnn; Key A/DE; Key A/TI; JN=nnnnn
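The contrast on this slide can be sketched with plain sets. This is an illustrative reading only: the document identifiers and posting sets are invented, and the field labels (TI, DE, JN) are taken from the slide.

```python
# Hypothetical posting sets: documents matching key A in each field.
title_hits = {"d1", "d2", "d3"}        # Key A/TI
descriptor_hits = {"d2", "d3", "d4"}   # Key A/DE
journal_hits = {"d3", "d4", "d5"}      # JN=nnnnn

# Traditional Boolean IR: union over the basic index, then AND journal name.
boolean_result = (title_hits | descriptor_hits) & journal_hits

# Cognitive overlaps: keep each representation separate and intersect them.
total_overlap = title_hits & descriptor_hits & journal_hits
pairwise_overlaps = {
    "TI+DE": title_hits & descriptor_hits,
    "TI+JN": title_hits & journal_hits,
    "DE+JN": descriptor_hits & journal_hits,
}
```

The Boolean result treats d3 and d4 alike, whereas the overlap view separates d3 (found through all three representations) from d4 (found through two).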

12 Also in other media than text...
“...the Polyrepresentation principle may be applied to non-textual information objects of various media types and various genres...” (The Turn, p. 342)
Example: polyrepresentation in the context of graphic objects (media) of pop art (genre) [by Danish students, 2009].

13 Earlier use of features for IR – not adhering explicitly to polyrepresentation (or any other theory)
Databases via (relevant) seed documents (Medline+SCI): McCain (1989), Pao (1994)
Engines (probabilistic + vector space): I3R, Croft & Thompson (1987) – overlaps not assessed for relevance (union: to increase recall; intersection: to increase precision)
Weighting & indexing algorithms with human RF: combinations seem to outperform individual algorithms, Ruthven, Lalmas & van Rijsbergen (2002)
Different searcher statements: combinations outperform single query formulations, Belkin et al. (1993)

14 Polyrepresentation lessons
Some experiences from practical application of polyrepresentation:
  Skov, Larsen & Ingwersen (2004; 2008)
  Larsen, Ingwersen & Lund (2009)
  Kelly et al. (2005; 2007) – information space polyrepresentation
  White et al. (2006) – on relevance feedback
  and later Efron (2009) – automatic generation of pseudo relevance assessments

15 Results of polyrep. experiments 2
Combinations of query representations (Skov et al., 2004; 2008)
Cystic Fibrosis collection (1200 docs., + reference lists, freq. of citations, graded relevance, 29 topics)
Tests of query structure; value-adding by MeSH terms; use of reference title words + TI + AB + DE
In total 15 different overlap combinations tested

16 Results for all 15 overlaps – restricted polyrepresentation

17 Skov et al.- applying weights to overlaps (Cumulated Gain values)
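For readers unfamiliar with the measure named in the slide title: cumulated gain is the running sum of graded relevance scores down a ranked list. The sketch below shows only that standard definition; the weights Skov et al. applied to the overlaps are not in this transcript and are not reproduced here. The function name and the sample grades are illustrative.

```python
from itertools import accumulate


def cumulated_gain(graded_relevance):
    """Return the cumulated gain at every rank of a ranked result list.

    graded_relevance: relevance grades in rank order (e.g. 0 to 3).
    """
    return list(accumulate(graded_relevance))


# Invented grades for a four-document ranking.
cg = cumulated_gain([3, 2, 0, 1])
```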

18 Results of experiments directly adhering to polyrepresentation 3
Results (Skov et al., 2008):
  The more cognitively different the representations in overlaps, the higher the precision
  Combinations with reference title terms outperformed other combinations as well as individual searches
  Structured queries outperformed unstructured queries over all combinations
  Re-ranking by citation frequency decreased performance (small numbers though!)

19 Overlap between different IR models (data fusion)
[Venn diagram: result sets of IR model X, IR model Y and IR model Z, with pairwise overlaps xy, xz and yz; the triple overlap xyz is the total cognitive overlap]

20 Two types of Polyrepresentation
Restricted/disjoint: each document appears in only one overlap (by NOT logic): documents in ‘fuse4’ are not in the ‘fuse3’ overlaps.
Relaxed/traditional: documents in ‘fuse4’ are also present in the ‘fuse3’ and ‘fuse2’ overlaps, providing a list of documents that may be ranked by weights according to presence.
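The two variants differ only in whether a level counts documents found by exactly k engines or by at least k engines. A minimal sketch, assuming four engines whose runs are given as lists of document identifiers; the function name, the runs, and the alphabetical tie-break in the ranking are invented for illustration.

```python
from collections import Counter


def fuse(runs):
    """Return (restricted, relaxed, ranking) for a list of result lists."""
    # Number of engines that retrieved each document.
    presence = Counter(doc for run in runs for doc in set(run))
    levels = range(1, len(runs) + 1)
    # Restricted/disjoint: exactly k engines, so each document is in one level.
    restricted = {k: {d for d, c in presence.items() if c == k} for k in levels}
    # Relaxed/traditional: at least k engines, so levels are nested.
    relaxed = {k: {d for d, c in presence.items() if c >= k} for k in levels}
    # Rank by presence (assumption: ties broken by document identifier).
    ranking = sorted(presence, key=lambda d: (-presence[d], d))
    return restricted, relaxed, ranking


# Invented runs from four hypothetical engines.
runs = [["a", "b", "c"], ["a", "b", "d"], ["a", "c", "e"], ["a", "f"]]
restricted, relaxed, ranking = fuse(runs)
```

Here level 4 plays the role of ‘fuse4’: document a is in it under both variants, but only the relaxed variant also lists a at levels 3 and 2.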

21 Lund et al. – data fusion (30 TREC 5 topics, DCV = 100)
ETH & COR: SMART family
UWG: special IR algorithm
GEN: NLP - machine

22 Kelly et al. 2005, 2007
TREC HARD track: 13 searchers contributed 45 topics
Searchers assessed relevance: off-topic; on-topic/relevant = relevant
Use of clarification forms
  Q1: Times in the past searching topic?
  Q2: Describe what you already know about topic
  Q3: Why do you want to know about this topic?
  Q4: Please input any additional keywords that describe your topic.

23 Overlap between different parts of the user’s cognitive structures
[Diagram labels: Request version; Task / Goal description; Document set A; Document set B; Cognitive overlap from IR model X; Precision; Recall (Kelly …)]

24 Kelly et al. 2005, 2007 cont. …
Lemur toolkit – Okapi BM25 engine, MAP + t-tests
Baseline run: using terms from TREC topic title and description (BL)
Experimental runs: BL + pseudo RF; BL + real RF; BL+Q2; BL+Q3 …
Results:
  No. of query terms per source: BL: 9.33; Q2: 16.18; Q3: 10.67; Q4: 2.33 (considerable variation)
  Pseudo RF lower than baseline (.284), but pseudo50 better than BL
  All single Q and Q-combinations (weighted union) outperform baseline (Q2+3+4: .368)
  Direct strong correlation between query length (BL > BL+Q4 > BL+Q3 …) and performance!
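One way to picture a weighted union of query sources is to add up a per-source weight for every term a source contributes. This is a sketch under assumptions, not the Lemur configuration used by Kelly et al.: the function name, the source weights, and the example terms are all hypothetical.

```python
def weighted_union(sources, source_weights):
    """Merge query terms from several sources into one weighted query.

    sources: dict mapping a source name (e.g. "BL", "Q2") to its terms.
    source_weights: dict mapping a source name to its weight; sources
    without an entry default to 1.0 (an assumption of this sketch).
    """
    query = {}
    for name, terms in sources.items():
        weight = source_weights.get(name, 1.0)
        for term in terms:
            # A term supplied by several sources accumulates their weights.
            query[term] = query.get(term, 0.0) + weight
    return query


# Invented example terms and weights.
sources = {
    "BL": ["solar", "energy"],
    "Q2": ["energy", "subsidies"],
    "Q4": ["policy"],
}
query = weighted_union(sources, {"BL": 1.0, "Q2": 0.5, "Q4": 0.5})
```

A term confirmed by two sources ("energy") ends up with more weight than a term from either source alone, which mirrors the overlap idea on the query side.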

25 Concluding remarks
Many possible ways of polyrepresentation yet to be tested
Some indications from experiments demonstrate that the principle works – but:
  Care must be taken over which cognitively different structures to combine: low-performing engines/actors will reduce performance.
  Use best performing combined

26 Concluding remarks
Unclear so far how citations (and inlinks) may perform: the time issue
More robust tests should be performed, including:
  bigger and more recent data sets
  graded relevance
  real searchers
  non-textual material
  contextual information (like implicit RF: White)
Integration of geometric models and polyrepresentation?


