Digging into Metadata (abridged) Michael Khoo, Xia Lin, Jae-wook Ahn Drexel University, USA Ceri Binding, Douglas Tudhope Hypermedia Research Unit, University.


Digging into Metadata (abridged) Michael Khoo, Xia Lin, Jae-wook Ahn Drexel University, USA Ceri Binding, Douglas Tudhope Hypermedia Research Unit, University of South Wales, UK Diana Massam, Hilary Jones MIMAS, University of Manchester, UK Digging Into Data Program Meeting, Montreal, October 12, 2013

Metadata Connects Data
Problem space
– Plenty of metadata in DLs, but usually in silos
Aim
– Federated discovery across heterogeneous DLs
– Educational DLs, Dublin Core metadata
– Support easy cross-DL browsing
Method
– Harvest metadata into central bucket
– Run text analysis across metadata fields in each record (title, subject, description) and extract terms
– Use terms to generate DDC for each record
– Develop search/browse tools to run across the new DDC in the augmented bucket
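The method steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TERM_TO_DDC` lookup, the `extract_terms` tokenizer, and the majority-vote rule in `assign_ddc` are hypothetical stand-ins, not the project's actual text analysis. The DDC numbers are borrowed from the browse slides later in the deck.

```python
import re
from collections import Counter

# Hypothetical term-to-DDC lookup. The class numbers appear on later slides;
# the choice of trigger terms is purely illustrative.
TERM_TO_DDC = {
    "seeds": "635.6",
    "fruits": "635.6",
    "forage": "633.2",
    "silage": "633.2",
    "food": "641.3",
}

# The three Dublin Core fields named on the slide.
FIELDS = ("title", "subject", "description")


def extract_terms(record):
    """Lowercase and tokenize the title, subject and description fields."""
    text = " ".join(record.get(field, "") for field in FIELDS)
    return re.findall(r"[a-z]+", text.lower())


def assign_ddc(record):
    """Return the DDC class matched by the most terms, or None if no term matches."""
    votes = Counter(
        TERM_TO_DDC[term] for term in extract_terms(record) if term in TERM_TO_DDC
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def augment(records):
    """Copy each harvested record and add a 'ddc' key (the augmented bucket)."""
    return [dict(record, ddc=assign_ddc(record)) for record in records]
```

A record such as `{"title": "Growing forage crops", "subject": "silage"}` would be tagged with the class that collects the most matching terms. A real system would need a far richer vocabulary and some handling of ties and ambiguous terms.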

National Science Digital Library, Drexel U., Manchester U., South Wales

NSDL      IPL       Intute     Total
98,507    40,973    124,070    263,550

Project Workflow
Databases:
– MASH: Metadata Aggregation Storage and Handling (harvesting)
– DISTIL: Document Indexing & Semantic Tagging Interface for Libraries (technical jiggery-pokery)
– DRAMs: Dynamic Representations of Annotated Metadata (viz. tools)

Dashboard

Network Analysis/Interactive Views
Build a DDC concept-to-concept graph in Gephi
– Two nodes (concepts) are connected if their similarity score exceeds a certain threshold
– Similarity scores calculated from the DDC codes retrieved from DISTIL
Export Gephi graphs into sigma.js
– JS interactive browser
Users interact with the graph through the browser:
– Overview: show the distribution of all concepts and their structural/content-based clusters
– Details: selectively show the node labels
– More details: on mouse-over, show more detailed information about the nodes
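The thresholding rule for the concept graph can be illustrated with a short Python sketch. The slide does not say how similarity is computed, so `ddc_similarity` below assumes a simple shared-prefix measure over the digits of two DDC numbers. That measure, the function names, and the threshold value in the usage note are all assumptions made for illustration.

```python
from itertools import combinations


def ddc_similarity(a, b):
    """Assumed measure: fraction of leading digits two DDC numbers share."""
    digits_a, digits_b = a.replace(".", ""), b.replace(".", "")
    shared = 0
    for x, y in zip(digits_a, digits_b):
        if x != y:
            break
        shared += 1
    return shared / max(len(digits_a), len(digits_b))


def build_edges(codes, threshold):
    """Connect two concepts when their similarity exceeds the threshold."""
    edges = []
    for a, b in combinations(sorted(set(codes)), 2):
        score = ddc_similarity(a, b)
        if score > threshold:
            edges.append((a, b, score))
    return edges
```

With the six class numbers shown on the browse slides and a threshold of 0.5, this measure links 635.5 with 635.6 and 641.3 with 641.6, and leaves 633.2 and 664.8 unconnected. The resulting `(source, target, weight)` tuples could be written out as a CSV edge list, which is one of the formats Gephi can import.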

Network-based DDC browse


635.6 Edible garden fruits and seeds

664.8 Fruits and vegetables; commercial processing

635.5 Salad greens; garden crop; Salad greens

641.3 Food; food science; technology and engineering

633.2 Forage crops; forage crop; silage

641.6 Cooking specific materials; cooking with; salt …

To Do
Systematic evaluation of individual project steps
– What works, what does not
– What is generalizable, extensible, scalable
– Increase scope of technical jiggery-pokery (e.g. analyze full texts to produce browsable topic/subject-based network graphs)
Significant refinement of interface and usability
– User studies, mental models

Lessons Learned
– Collaboration takes a lot of work
– Good project management is useful
– Shared documents work really well
– Structured meetings work really well
– Face-to-face works really well

Early/Incremental Results
– Binding, C., Tudhope, D., Ahn, J-W., Khoo, M., Lin, X., Massam, D., & Jones, H. (2013). Digging Into Metadata. 12th European Networked Knowledge Organization Systems (NKOS) Workshop at the TPDL Conference, Valletta, Malta, Thursday 26th September.
– Khoo, M., Ding, Y., Kowalczyk, S., & Mayernik, M. (2013). Managing Big Data and Big Metadata: Contributions From Digital Libraries. ACM/IEEE Joint Conference on Digital Libraries, Indianapolis, IN, July 22-26.
– Khoo, M., Tudhope, D., Binding, C., Jones, H., & Orrego, I. (2013). OAI-PMH and Metadata Aggregation From Heterogeneous Digital Libraries: Three Case Studies. Conference Note: iConference 2013, Fort Worth, TX, February 12-15.
– Khoo, M., Tudhope, D., Binding, C., Abels, E., Lin, X., & Massam, D. (2012). Towards Digital Repository Interoperability: The Document Indexing and Semantic Tagging Interface for Libraries (DISTIL). Theory and Practice of Digital Libraries (TPDL) 2012, Paphos, Cyprus, September 23-27.
– Khoo, M. (2012). Digging Into Metadata. Invited panelist: "Library and Information Science in the Big Data Era: Funding, Projects, and Future." 75th Annual Meeting of the American Society for Information Science and Technology, Baltimore, MD, October 26-30.
– Khoo, M., Tudhope, D., & Binding, C. (2012). Extracting Dewey Decimal Classifications from Dublin Core Metadata Records With the DISTIL Project: Preliminary Findings and Observations. Position paper: 11th European Networked Knowledge Organization Systems (NKOS) Workshop, Theory and Practice of Digital Libraries (TPDL), Paphos, Cyprus, September 23-27.
– Khoo, M. (2012). Invited panelist: "Evaluating Digital Libraries - Methodologies and Challenges." Theory and Practice of Digital Libraries (TPDL), Paphos, Cyprus, September 23-27, 2012.

merci thank you

Alternate Interface(s)