Cluj-Napoca, 28 August 2008. 2008 IEEE International Conference on Intelligent Computer Communication and Processing, Digital Libraries Workshop. Towards a GRID-Based Digital Library Management System

Presentation transcript:

Cluj-Napoca, 28 August 2008
2008 IEEE International Conference on Intelligent Computer Communication and Processing, Digital Libraries Workshop

Towards a GRID-Based Digital Library Management System

Gheorghe Sebestyén-Pál (1), Doina Banciu (2), Tünde Bálint (1), Bogdan Moscaiuc (1), and Ágnes Sebestyén-Pál (1)
(1) Technical University of Cluj-Napoca
(2) ICI Bucharest

Debrecen, 3-5 September 2008, DAPSYS’08, 7th INTERNATIONAL CONFERENCE ON DISTRIBUTED AND PARALLEL SYSTEMS

Content
- Classical vs. Digital Libraries
- Recent research on Digital Libraries (DL)
- Main issues and requirements for DLs
- An ontology-based DL model
- Grid-enabled DL
- Implementation considerations of a pilot DL
- Experiments
- Conclusions

Classical vs. Digital Libraries
- Classical library: a repository of knowledge organized mainly on paper
- Digital library:
  - Not only a digitized version of a classical library
  - Adds a new set of functionalities and services (e.g. access control, resource management and allocation, complex search and processing services)
  - A data exchange and cooperation environment
- DLs are becoming digital content management systems
  - They incorporate a wide variety of formats and data types (text, audio, video, multi-document complex digital objects)
  - They use a variety of communication and data-exchange protocols and standards

IT and communication technologies involved in the implementation of digital libraries

Goals for modern DLs
- DELOS project’s vision: “to enable any person to access all human knowledge anytime and anywhere, in a friendly, multi-modal, efficient, and effective way, by overcoming barriers of distance, language, and culture and by using multiple Internet-connected devices”
- DL: a knowledge repository and an information exchange infrastructure that allows:
  - data generation,
  - processing, and
  - seamless access to relevant information,
  regardless of the geographic distribution of hardware resources, databases or persons.

Research in digital libraries
- DELOS Network of Excellence
  - Goals: to define and implement digital libraries on new computing and communication technologies
  - Achievements: definition of functional and architectural requirements for DL implementation
- BRICKS project
  - Goals: to design a user- and service-oriented space for sharing knowledge and resources in a multi-cultural heritage
  - Achievements: definition of a digital library architecture for a very broad and heterogeneous user community; automatic indexing and annotation functionalities
- OpenDLib project
  - Goal: development of a software toolkit for generating dedicated DLs
  - Achievements: tools for content harvesting from existing resources
- Fedora, DSpace: open-source software for DLs
- Lucene: open-source search engine

Research in digital libraries (cont.)
- Diligent project (part of EGEE project)
  - Goal: the use of GRID infrastructure for DL implementation
  - Achievements:
    - A new vision of the DL concept: DL = a dynamic digital content repository and management system dedicated to a purpose (e.g. a project, an art collection, an academic course)
    - Definition of generic DL services mapped onto GRID services
    - DLs dedicated to different domains, with powerful processing capabilities
- SINRED project (national excellence project)
  - Goal: development of a national framework for DLs specialized in technical sciences and research
  - Achievements: evaluation of requirements, evaluation of existing software, infrastructure development, DL model definition, implementation of a pilot DL
- SIPADOC project (national research program)
  - Goal: reevaluation of the national patrimony through DLs
  - Achievements: evaluation of digitizing tools

Key issues in DL implementation
- Architectural issues:
  - Distributed nature of storage, processing and access resources
  - Scalability, flexibility, interoperability
- Functional requirements:
  - Core functions: storage, indexing and annotation, data search, content retrieval, user management
  - Content organization should reflect semantic connections
- Processing facilities:
  - Data processing services specialized for different fields
  - Pattern search and recognition
- QoS issues:
  - Restricted time to obtain relevant information
  - Reasonable time for complex data processing
- User and access control management:
  - Virtual organizations
  - Role-based access

DL = Essence & Metadata Management
- Content types: text, audio, video
- Digital content generation and harvesting
- Management of essence
- Automatic feature (metadata) extraction
- Metadata management
- Cataloging, indexing, annotation
- Access and visualization
- Cataloging information system

An ontology-based Digital Library approach
- Ontology: concepts and relations together with a reasoning engine
- Ontology for technical and scientific domains
- Main concepts:
  - Digital objects
    - Association of content, metadata and procedures
    - Examples: articles, technical reports, prospectuses, PhD theses, patents
  - Digital collections
    - Set of digital objects structured for a given goal/purpose or based on a given criterion
    - Examples: articles of an author, documents of a domain
  - Events
    - Conferences, workshops, seminars
  - Processes
    - Projects
    - Courses
  - Virtual organizations
    - Roles
    - Users
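
The main concepts of the ontology can be sketched as plain data structures. The following is a minimal illustration only, not the authors' model: all class names, field names (such as `content_uri` and `procedures`) and the helper `collect_by` are hypothetical, and no reasoning engine is included. It shows a digital object as an association of content, metadata and procedures, and a collection built from a criterion such as "articles of an author".

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Association of content, metadata and procedures (field names are assumptions)."""
    identifier: str
    kind: str                 # e.g. "article", "technical report", "patent"
    content_uri: str          # hypothetical pointer to the stored content
    metadata: dict = field(default_factory=dict)
    procedures: dict = field(default_factory=dict)  # name -> callable, e.g. a viewer


@dataclass
class DigitalCollection:
    """Set of digital objects structured for a goal or selected by a criterion."""
    name: str
    members: list = field(default_factory=list)


@dataclass
class Event:
    """Conference, workshop or seminar that digital objects can be linked to."""
    name: str
    kind: str
    objects: list = field(default_factory=list)


def collect_by(objects, name, criterion):
    """Build a collection from the objects that satisfy the given criterion."""
    return DigitalCollection(name, [obj for obj in objects if criterion(obj)])


# Hypothetical sample data, for illustration only.
SAMPLE_OBJECTS = [
    DigitalObject("obj-1", "article", "store://obj-1", {"creator": "A. Author"}),
    DigitalObject("obj-2", "patent", "store://obj-2", {"creator": "A. Author"}),
    DigitalObject("obj-3", "article", "store://obj-3", {"creator": "B. Writer"}),
]

ARTICLES_BY_AUTHOR = collect_by(
    SAMPLE_OBJECTS,
    "Articles by A. Author",
    lambda obj: obj.kind == "article" and obj.metadata.get("creator") == "A. Author",
)
```

Expressing the criterion as a function keeps "collection by goal" and "collection by criterion" in one mechanism; a real ontology would express the same relation declaratively.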

Grid-enabled digital library services
Why DLs on GRID infrastructure?
- Huge volume of documents/digital objects
- Concurrent access and multiple search engines (see Google)
- Multimedia streaming
- Automatic indexing and annotation
- Complex processing that would otherwise take prohibitive time
- User management through virtual organizations
- Job distribution facilities offered by GRID

DL functions mapped on GRID services
- Computing, storage and communication resources
- Digital Library
- GRID Services
- Collections management
- Catalog and metadata management
- Digital objects management
- Users’ management
- Data visualization
- Virtual organizations management
- Resource management
- Task distribution
- Processing
- Data distribution and replication
- Data processing

Experiments
Two approaches:
- DL implementation on Alchemi GRID (Microsoft)
  - Job distribution at thread level
  - Explicit GRID programming
  - Experiments with multimedia streaming (multimedia content distribution)
- DL implementation on Condor GRID (open source)
  - Job distribution at task level
  - Job and data distribution is transparent to the DL application (distribution is done through separate scripts)
  - Experiments with keyword search over the whole DL content
    - The execution time decreased as the number of executor computers increased
    - For more than 5 executors, the scheduling and communication time becomes comparable with the execution time
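
The task-level keyword search experiment can be illustrated with a small sketch. This is not the authors' Condor setup: it assumes a local thread pool stands in for the executor computers, and the function names, the sample corpus and the cost model are all hypothetical. The `modeled_time` function is only a toy model, assumed here to show why a per-job scheduling and communication overhead eventually offsets the gain from adding executors; it does not reproduce any measured result.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(doc_ids, n_executors):
    """Split document ids into n_executors chunks (round-robin)."""
    return [doc_ids[i::n_executors] for i in range(n_executors)]


def search_chunk(corpus, doc_ids, keyword):
    """One job: scan a chunk of documents for a keyword, case-insensitively."""
    needle = keyword.lower()
    return [doc_id for doc_id in doc_ids if needle in corpus[doc_id].lower()]


def distributed_search(corpus, keyword, n_executors):
    """Run one job per chunk and merge the partial hit lists."""
    chunks = partition(sorted(corpus), n_executors)
    with ThreadPoolExecutor(max_workers=n_executors) as pool:
        partial_hits = list(
            pool.map(lambda ids: search_chunk(corpus, ids, keyword), chunks)
        )
    return sorted(hit for hits in partial_hits for hit in hits)


def modeled_time(scan_time, n_executors, per_job_overhead):
    """Toy cost model (assumption): evenly split scan plus a fixed overhead per job."""
    return scan_time / n_executors + per_job_overhead * n_executors


# Hypothetical sample corpus, for illustration only.
SAMPLE_CORPUS = {
    "doc1": "Grid computing for digital libraries",
    "doc2": "Classical library catalog",
    "doc3": "GRID-enabled full-text search",
}
```

Calling `distributed_search(SAMPLE_CORPUS, "grid", 2)` splits the three documents into two jobs and merges their hits. In the toy model, the overhead term grows with the number of jobs while the scan term shrinks, so the total stops improving beyond some executor count.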

A pilot implementation of a Digital Library framework developed with GRID support
- Goal: implementation of a digital content storage and retrieval system dedicated to educational and scientific activities (courses, projects, etc.)
- Main requirements:
  - A DL adaptable to a given purpose/goal
  - Access controlled and restricted through virtual organizations
  - Ontology-based approach (concepts, relations, semantic search)
  - Advanced search procedures
  - GRID-enabled full-text search services, for better response time
  - Access through Internet browsers
- The result: a distributed digital library application, which allows:
  - Management of digital objects (upload, storage, indexing, metadata creation)
  - Management of collections
  - Management of users and virtual organizations

Pilot DL details
- Management of digital objects
  - Upload of digital documents
  - Annotation and metadata generation according to Dublin Core
  - Distributed storage of data
- Management of collections
  - Define a new collection
  - Attach new documents to an existing collection
  - Associate access rights with a collection
- Management of users and virtual organizations
  - Define new users and new virtual organizations
  - Define roles
  - Associate roles with users and collections
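
Two of the pilot functions above, Dublin Core metadata and roles associated with users and collections, can be sketched as follows. This is an illustration under assumptions, not the pilot's code: the element names are the 15 elements of the Dublin Core Metadata Element Set, while the role names, the permission table and the `AccessControl` class are hypothetical.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical roles and the actions each one permits on a collection.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "editor": {"read", "upload"},
    "manager": {"read", "upload", "grant"},
}


def make_dc_record(**fields):
    """Return a metadata record, rejecting keys that are not Dublin Core elements."""
    unknown = set(fields) - DUBLIN_CORE_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return dict(fields)


class AccessControl:
    """Associates a role with a (user, collection) pair and checks actions."""

    def __init__(self):
        self._roles = {}  # (user, collection) -> role name

    def assign(self, user, collection, role):
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self._roles[(user, collection)] = role

    def allowed(self, user, collection, action):
        role = self._roles.get((user, collection))
        return role is not None and action in ROLE_PERMISSIONS[role]
```

For example, after `acl.assign("alice", "course-notes", "editor")`, the call `acl.allowed("alice", "course-notes", "upload")` returns `True`, while any user with no role on a collection is denied every action.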

Snapshots of the DL application’s interface
bib-dig.utcluj.ro

Snapshots of the DL application’s interface

Search techniques in DLs
- Keyword or index search: database techniques
- Semantic information retrieval: semantic graph with documents and concepts
- Non-semantic information retrieval:
  - Naive Bayes algorithm
    - Probabilistic approach
    - Based on probabilistic similarity between documents
  - Topic-Based Vector Space Model algorithm
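
The probabilistic approach listed above can be illustrated with a small Naive Bayes text classifier. The slides do not say which variant was used, so this sketch assumes the common multinomial form with add-one (Laplace) smoothing; the class name, the whitespace tokenizer and the training sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed variant)."""

    def fit(self, labeled_docs):
        self.class_docs = Counter()              # label -> number of documents
        self.term_counts = defaultdict(Counter)  # label -> term frequencies
        self.vocab = set()
        for text, label in labeled_docs:
            self.class_docs[label] += 1
            for token in tokenize(text):
                self.term_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def log_scores(self, text):
        """Log prior plus smoothed log likelihood of each known term, per class."""
        total_docs = sum(self.class_docs.values())
        scores = {}
        for label, n_docs in self.class_docs.items():
            total_terms = sum(self.term_counts[label].values())
            score = math.log(n_docs / total_docs)
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # terms never seen in training carry no evidence
                count = self.term_counts[label][token]
                score += math.log((count + 1) / (total_terms + len(self.vocab)))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical training data, for illustration only.
TRAINING_DOCS = [
    ("grid job scheduling on executor nodes", "grid"),
    ("condor grid task distribution", "grid"),
    ("dublin core metadata cataloging", "metadata"),
    ("metadata annotation and indexing of documents", "metadata"),
]

MODEL = NaiveBayes().fit(TRAINING_DOCS)
```

Working in log space avoids numeric underflow when many term probabilities are multiplied, and the smoothing keeps a single unseen term from zeroing out a class.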

Experimental results

Experiments

Conclusions
- DLs are complex content management systems that extend the functionalities of classical libraries:
  - Semantic organization of a wide variety of information formats
  - Multiple search and data retrieval techniques:
    - Keyword full-text search
    - Semantic search
    - Statistical and probabilistic retrieval and classification
  - Access control to distributed and remote data
- DLs are data exchange and cooperation environments
  - Useful for remote and cooperative work
- DLs must include powerful search and data retrieval engines
- GRID infrastructures may be a feasible support for the implementation of DLs
  - For more efficient parallel search, classification or automatic annotation

Thank you for your attention
Questions?