Enhancing social tagging with a knowledge organization system Brian Matthews STFC.


1 Enhancing social tagging with a knowledge organization system Brian Matthews STFC

2 Outline
• Who are STFC?
• Controlled Vocabulary
• Social Tagging
• EnTag
  – Aims
  – Glamorgan/UKOLN/Intute Experiment
  – STFC Experiment
• SKOS

3 Science and Technology Facilities Council
• Provide large-scale scientific facilities for UK Science
  – particularly in physics and astronomy
• E-Science Centre at RAL and DL
  – Provides advanced IT development and services to the STFC Science Programme
  – Also includes library and institutional repository
  – Strong interest in Digital Curation of our science data
  – Keep the results alive and available
  – R&D Programme: DCC, CASPAR, EnTag

4 Controlled Vocabulary
• Traditional way of providing subject classification
  – For shelf-marking
  – For searching
  – For association of resources
• Several different types used, such as
  – Subject Classification
  – Keyword lists
  – Thesaurus
• Each has different characteristics

5 HASSET (I)
• UK Data Archive, Univ of Essex
• Humanities and Social Science Electronic Thesaurus
• Thousands of terms
• Structure based on British Standard 5723:1987/ISO 2788-1986 (Establishment and development of monolingual thesauri)
• Preferred terms, broader-narrower relations, associated terms
http://www.data-archive.ac.uk/search/hassetSearch.asp

6 HASSET (II)

7 HASSET (III)

8 Observations on using controlled vocabularies
• Precise classification of resources
  – Good for precision and recall
• Can exploit the hierarchy to modify query
  – Using the broader/narrower/related terms
• Highly expensive
  – Requires investment in specialist expertise to devise the vocabulary
  – Requires investment in specialist expertise to classify resources
• Hard to maintain currency
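Exploiting the hierarchy to modify a query can be sketched roughly as follows. This is a minimal illustration only: the `THESAURUS` dictionary, its terms, and the `expand_query` function are hypothetical and are not taken from HASSET or from any EnTag system.

```python
# Hypothetical mini-thesaurus; terms and relations are illustrative only.
THESAURUS = {
    "education": {
        "narrower": ["higher education", "primary education"],
        "related": ["training"],
    },
    "higher education": {
        "broader": ["education"],
        "narrower": ["universities"],
    },
    "universities": {"broader": ["higher education"]},
}


def expand_query(term, relations=("narrower",), depth=1):
    """Return the term plus terms reachable via `relations` within `depth` steps."""
    seen = [term]          # keeps discovery order, starting with the original term
    frontier = [term]      # terms whose relations are followed in the next step
    for _ in range(depth):
        next_frontier = []
        for current in frontier:
            entry = THESAURUS.get(current, {})
            for relation in relations:
                for other in entry.get(relation, []):
                    if other not in seen:
                        seen.append(other)
                        next_frontier.append(other)
        frontier = next_frontier
    return seen
```

Calling `expand_query("education")` widens a search to the narrower terms one level down, while `expand_query("universities", relations=("broader",), depth=2)` generalises it upward. Passing `("related",)` follows associated terms instead.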

9 Social Tagging
• The Web 2.0 way of providing search terms
• People “tag” resources with free-text terms of their own choosing
• Tags used to associate resources together
• del.icio.us, flickr
• “Folksonomy”
  – the terms a community chooses to use to tag its resources

10 Connotea

11 Connotea – sharing tags

12 Connotea – Tag Cloud

13 Observations on Social Tagging
• People often use the same tags or keywords (e.g. Preservation, Digital Library)
  – this makes things which mean the same thing to people easier to find
• Cheap way of getting a very large number of resources marked up and classified
  – Represents the “community consensus” in some sense
  – “The Wisdom Of Crowds”
  – Has currency as people update
  – Tag clouds of popular tags
• However, people often use similar but not the same tags:
  – e.g. Semantic Web, SemanticWeb, SemWeb, SWeb
• People make mistakes in tags
  – misspellings, using spaces incorrectly
• Some tags are more specific than others:
  – e.g. controlled vocabulary, thesaurus, HASSET
• People often associate the same words together with particular ideas in images
  – these are captured in clusters
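The problem of similar-but-not-identical tags can be illustrated with a small normalisation sketch. It is an assumption about one possible clean-up step, not something the slides describe: the `ALIASES` table and both function names are hypothetical, and the approach handles only case, spacing, and punctuation plus known abbreviations, not misspellings.

```python
import re
from collections import Counter

# Hypothetical alias table mapping known abbreviations to a normalised key.
ALIASES = {"semweb": "semanticweb", "sweb": "semanticweb"}


def normalize(tag):
    """Lowercase the tag, strip everything except a-z and 0-9, then apply aliases."""
    key = re.sub(r"[^a-z0-9]", "", tag.lower())
    return ALIASES.get(key, key)


def merge_tags(tags):
    """Count tags after normalisation so that variants collapse into one entry."""
    return Counter(normalize(tag) for tag in tags)
```

With this sketch, `merge_tags(["Semantic Web", "SemanticWeb", "SemWeb", "SWeb"])` counts all four variants under the single key `semanticweb`. Note that the regular expression also drops non-ASCII letters, which would need different handling for multilingual tags.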

14 EnTag Project: Enhanced tagging for discovery
• JISC funded project
• Partners
  – UKOLN
  – University of Glamorgan
  – STFC
  – Intute
  – Non-funded: OCLC Office of Research, USA; Danish Royal School of Library and Information Science
Period: 1 Sep 2007 to 30 Sep 2008
http://www.ukoln.ac.uk/projects/enhanced-tagging/

15 EnTag Background
• Controlled vocabularies
  – Improve information retrieval and discovery
  – But costly to index with, especially given the volume of digital documents
  – Require subject and classification experts
• Social tagging
  – Holds the promise of reducing indexing costs
  – Uses terms describing how people see the resource
  – Serendipity
  – But tags are uncontrolled, with missed associations
    – Relating different views
    – Highly personal (“me”, “important”)
    – Quality and ranking
    – Depth of term

16 EnTag Purpose
Investigate the combination of controlled and social tagging approaches to support resource discovery in repositories and digital collections
Aim to investigate
  – whether use of an established controlled vocabulary can help move social tagging beyond personal bookmarking to aid resource discovery

17 EnTag Objectives
Investigate indexing aspects when using only social tagging versus when using social tagging in combination with a controlled vocabulary
In particular, does this lead to:
• Improved tagging
  – Relevance of tags (perspective, aspects, specificity, exhaustivity, terminology (linguistic level, semantic level, contextual level))
  – Consistency
  – Efficiency (time used, user satisfaction)
  – Use (tags selected, clouds consulted, order of consultation)
• Improved retrieval
  – Effectiveness (degree of match between user and system terminology)
In two different contexts:
  – Tagging by readers
  – Tagging by authors

18 Testing Approach
Main focus:
  – free tagging with no instructions
Versus
  – tagging using a combined system and guidance for users
Two demonstrators
• Intute digital collection http://www.intute.ac.uk
  – Major development
  – Tagging by reader
  – DDC
• STFC repository http://epubs.cclrc.ac.uk/
  – Complementary development
  – Tagging by author
  – A more qualitative approach

19 Intute

20 Intute demonstrator: searching

21 Intute demonstrator : basic tagging

22 Intute demonstrator: enhanced tagging

23 EnTag: Intute user study (II)
Test setting
  – 50 graduate students in political science
  – 60 documents, covering up to four topics of relevance for the students
Data collection
  – Logging time spent, selection patterns
  – Pre- and post-questionnaires

24 EnTag: Intute user study (I)
Test: comparison of basic and advanced system:
  – Indexing
  – Perspective, specificity, exhaustivity
  – Linguistics (word class, single word/compound, spelling, language)
  – Consistency
  – Efficiency (time used, user satisfaction)
  – Use (tags selected, clouds consulted, order of consultation)
  – Retrieval efficiency
• Degree of match between user and system terminology
  – user tags, DDC tags, controlled Intute keywords, title terms, text terms
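One simple way to quantify a "degree of match" between user and system terminology is set overlap. The slides do not say which measure the study used, so the Jaccard coefficient below is an assumption chosen for illustration, and the function names and sample terms are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two term collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def match_report(user_tags, system_terms):
    """Compare user tags against each system term source, ignoring case.

    `system_terms` maps a source name (e.g. "title", "keywords") to its terms.
    """
    user = {tag.lower() for tag in user_tags}
    return {
        source: jaccard(user, {term.lower() for term in terms})
        for source, terms in system_terms.items()
    }
```

For example, `match_report(["Elections", "Voting"], {"keywords": ["elections", "voting"]})` reports a full match against the keyword source. The same `jaccard` function could also compare two taggers' tag sets for one document as a rough consistency measure.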

25 STFC Case Study: EPubs

26 STFC demonstrator

27 STFC Author study
• A study of authors of papers
  – Smaller number, c. 10-12
  – Regular depositors (> 10 papers each)
  – Subject experts
• Expect that they would want their papers accurately tagged so that they are precisely found
• A more qualitative study

28 Expected Feedback
• Relative value of tagging vs. controlled terms
  – Does it give more satisfactory (accurate, consistent) tags?
  – Does it lead to the consideration of tags they would not have thought of?
  – Do they select deeply in the hierarchy?
  – Is this something they would like to see supported more, and would use?
  – Is it worth the overhead?
• How should we use a combination of tagging and controlled vocab in our system?
To Be Continued…

29 Building a Web of Knowledge
• Social tagging and controlled vocabulary complement each other
  – Tagging: entry level, quick, does the job, but error prone, fuzzy
  – Controlled vocabulary: accurate, but slow and expensive
• Use one to leverage the other
• Use both to build a “Web of knowledge”
  – The things in the world and their links via their subjects
  – Get the users to build the means of organising the knowledge

30 http://purl.org/net/aliman

31 SKOS: Simple conceptual relationships

32 Conclusions
• Controlled vocabulary and tags complement each other
• Hope to get some interesting evidence over the next month as the studies are completed
• Web 2.0 world offers the possibility of combining these results
  – SKOS: a format to use both tags and controlled vocabulary as part of the Web of Linked Data
  – Also use Web 2.0 to build the vocabularies themselves
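To make the SKOS point concrete, the sketch below emits a SKOS concept in Turtle syntax from plain Python. The properties `skos:prefLabel`, `skos:altLabel`, and `skos:broader` are part of the SKOS vocabulary; everything else is an assumption for illustration: the `ex:` namespace, the concept names, the function name, and in particular the choice to record user tags as alternative labels, which is one possible modelling rather than the approach EnTag took.

```python
# Prefix declarations; the ex: namespace is a hypothetical placeholder.
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/vocab/> .\n"
)


def concept_to_turtle(name, pref_label, tags=(), broader=()):
    """Serialise one concept as Turtle, with user tags stored as skos:altLabel.

    Labels are inserted as-is, so they must not contain double quotes.
    """
    statements = ["a skos:Concept", f'skos:prefLabel "{pref_label}"@en']
    statements += [f'skos:altLabel "{tag}"@en' for tag in tags]
    statements += [f"skos:broader ex:{parent}" for parent in broader]
    return f"ex:{name} " + " ;\n    ".join(statements) + " .\n"


if __name__ == "__main__":
    print(PREFIXES)
    print(concept_to_turtle(
        "semanticWeb",
        "Semantic Web",
        tags=["SemWeb", "SWeb"],
        broader=["worldWideWeb"],
    ))
```

The controlled term becomes the preferred label, the tag variants hang off the same concept, and `skos:broader` carries the thesaurus hierarchy, so both kinds of term can be published in a single graph.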

33 Questions? b.m.matthews@rl.ac.uk

34 EnTag – Enhanced tagging for discovery
Research collaboration between Glamorgan University, UKOLN, INTUTE, CCLRC, OCLC, and DB
Financed by JISC Capital Programme
Research goal: Investigation of the combination and comparison of controlled and folksonomy approaches to semantic interoperability supporting resource discovery in repositories and digital collections
Evaluation in two communities of use: at Intute (social science), focussing on tagging by readers (postgraduate users), and at CCLRC, focussing on tagging by authors
The two studies are carried out as separate projects
Intute project uses DDC as controlled vocabulary
Evaluation by quantitative and qualitative measures

35 Evaluation Intute – focus and objective
Context: tagging as part of information searching and relevance assessment, tagging for recommendation and sharing
Hybrid system: investigate whether tagging can be improved by a combination of traditional tag clouds and clouds of controlled descriptors, including interactive tools such as tag suggestions, access to browsing of DDC, etc.
Improve tagging
  – Relevance of tags (perspective, aspects, specificity, exhaustivity, terminology (linguistic level, semantic level, contextual level))
  – Consistency
  – Efficiency (time used, user satisfaction)
  – Use (tags selected, clouds consulted, order of consultation)
Improve retrieval
  – Effectiveness (degree of match between user and system terminology)

