Event-Based Infrastructure for Reconciling Distributed Annotation Records Ahmet Fatih Mustacoglu Advisor: Prof. Geoffrey C. Fox.

Outline
- Introduction
- Motivations and research issues
- Architecture
- Event-Based Infrastructure
- Measurements and Analysis
- Conclusions
- Contributions and Future Works

2/24/2016 Ahmet Fatih Mustacoglu

Online Collaboration
- Rapid development of annotation tools and services
- Aimed at fostering online collaboration and sharing between users and communities:
  - Bookmarking tools support sharing and annotation using keywords called tags
    - e.g. del.icio.us
  - Tools for annotation and sharing of scholarly publications
    - Connotea
    - Citeulike
    - Bibsonomy
  - Social networking tools
    - e.g. MySpace and Facebook
  - Video sharing and annotation
    - e.g. YouTube

Motivations
- Various annotation tools, each with different and limited metadata storage
  - Multiple instances of metadata about the same document
  - No time-stamp information for updated records
  - Causing inconsistencies
- Lack of interoperability between annotation sites
- Applying a service-based architecture to annotation systems
- Unification and federation of major annotation tools, to use them with added capabilities for scientific research
  - Management of metadata coming from different sources
  - Adding missing services
    - Upload metadata to and extract metadata from a repository

Research Issues I
- Need an infrastructure to manage metadata
  - Dealing with metadata coming from several sources
- Issues with using annotation tools and their services with added capabilities
  - Extract data from and upload data to tools
  - More metadata support for documents
  - Providing communication between annotation tools
- Issues with document tracking and access to previous versions of documents
- Consistency enforcement
  - Issues with maintaining consistency between copies of a record stored at various annotation tools

Research Issues II
- Unification
  - How to combine different annotation tools under the same umbrella?
- Federation
  - How to federate major annotation tools?
- Scalability
  - System behavior as the message rate per second increases
- Flexibility and Extensibility
  - Interoperable with other clients
  - Ease of integrating an annotation tool

[Figure: Event-based Infrastructure and Consistency Enforcement Architecture]

Key Concepts
- Distributed Annotation Record (DAR): A collection of metadata stored at an annotation tool.
- Digital Entity (DE): A collection of metadata for a citation, stored in the system database, that forms the primary copy of a DAR.
- Event: A time-stamped action on a digital entity
  - Major events: Insertion or deletion of a digital entity
  - Minor events: Modifications to an existing digital entity
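The event definitions above can be pictured as a small data structure. The following is only an illustrative Python sketch, not the system's actual classes: the names `Event`, `EventKind`, `entity_id`, `action`, and `changes`, and the action strings `"insert"`, `"delete"`, and `"modify"`, are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class EventKind(Enum):
    MAJOR = "major"  # insertion or deletion of a digital entity
    MINOR = "minor"  # modification of an existing digital entity


@dataclass(frozen=True)
class Event:
    """A time-stamped action on a digital entity (hypothetical layout)."""
    entity_id: str
    action: str  # assumed action names: "insert", "delete", "modify"
    timestamp: datetime
    changes: dict = field(default_factory=dict)

    @property
    def kind(self) -> EventKind:
        # Insertions and deletions are major events; any other action is minor.
        if self.action in ("insert", "delete"):
            return EventKind.MAJOR
        return EventKind.MINOR


insert = Event("de-1", "insert", datetime(2009, 1, 1, tzinfo=timezone.utc), {"title": "Sample"})
retag = Event("de-1", "modify", datetime(2009, 1, 2, tzinfo=timezone.utc), {"tags": ["grid"]})
```

Here `insert.kind` evaluates to `EventKind.MAJOR` and `retag.kind` to `EventKind.MINOR`.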

Communication Manager
- Provides communication between the annotation tools and both the update manager and the digital entity manager, via gateways
  - e.g. Connotea gateway
- Uses one gateway for each annotation tool, plus a parser
  - Retrieves records in XML format
  - Parses the records and passes them to the update manager
  - Posts updates coming from the digital entity manager to the annotation tools
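The retrieve-and-parse step could look roughly like the sketch below, which uses Python's standard `xml.etree.ElementTree` module. The XML layout (`records`, `record`, `title`, `tag`) and the function name `parse_records` are assumptions for illustration; the real feeds of the annotation tools are not described in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload; real tool feeds will use a different schema.
SAMPLE_XML = """
<records>
  <record id="r1">
    <title>Web 2.0 for Grids and e-Science</title>
    <tag>grid</tag>
    <tag>web2.0</tag>
  </record>
</records>
"""


def parse_records(xml_text: str) -> list[dict]:
    """Turn an XML payload into plain dictionaries for the update manager."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for node in root.findall("record"):
        records.append({
            "id": node.get("id"),
            "title": node.findtext("title", default=""),
            "tags": [tag.text for tag in node.findall("tag")],
        })
    return records


parsed = parse_records(SAMPLE_XML)
```

Keeping the parser separate from the gateways means each tool-specific format only has to be mapped to one common record shape.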

[Figure: Communication Manager]

Gateway
- Interface between the Event-based Infrastructure (EBI) and each annotation tool
- Provides extensibility
- A gateway needs to be deployed for each annotation tool that is to be integrated into the system

[Figure labels: Annotation Tools, Gateways, EBI Modules, EBI]
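One way to read the gateway idea is as a common interface with one implementation per tool. The sketch below is a minimal Python illustration under that assumption; the method names `fetch_records` and `post_update` and the `InMemoryGateway` stand-in are hypothetical and do not call any real annotation-tool API.

```python
from abc import ABC, abstractmethod


class Gateway(ABC):
    """Interface between the infrastructure and one annotation tool."""

    @abstractmethod
    def fetch_records(self) -> list[dict]:
        """Return the records currently held by the tool."""

    @abstractmethod
    def post_update(self, record: dict) -> None:
        """Send a new or changed record to the tool."""


class InMemoryGateway(Gateway):
    """Stand-in for a real tool gateway; keeps records in a dictionary."""

    def __init__(self) -> None:
        self._store: dict[str, dict] = {}

    def fetch_records(self) -> list[dict]:
        return list(self._store.values())

    def post_update(self, record: dict) -> None:
        self._store[record["id"]] = dict(record)


# Integrating another tool means registering one more gateway here.
gateways: dict[str, Gateway] = {
    "connotea": InMemoryGateway(),
    "citeulike": InMemoryGateway(),
}
for gateway in gateways.values():
    gateway.post_update({"id": "r1", "title": "Sample"})
```

Because the rest of the system only sees the `Gateway` interface, adding a tool does not require changes to the other modules, which is the extensibility point the slide makes.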

Annotation Tools Update Manager
- Responsible for:
  - Retrieving the records from annotation tools periodically (time-based consistency approach by pulling records)
  - Finding out the updates
  - Passing the updates to the Digital Entity Manager so that they can be applied to the primary copy of each record
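The "finding out the updates" step can be sketched as a comparison between the primary copies and the records just pulled from a tool. This Python sketch is an assumption about how such a comparison might work, not the system's algorithm. In particular, treating a record that is missing from the pull as a deletion is a simplification made here, and fields removed from a record are not detected.

```python
def find_updates(primary: dict[str, dict], pulled: dict[str, dict]) -> list[tuple[str, str, dict]]:
    """Compare pulled records with primary copies.

    Returns (action, record_id, changed_fields) tuples.
    """
    updates = []
    for record_id, record in pulled.items():
        if record_id not in primary:
            updates.append(("insert", record_id, record))
        else:
            # Keep only the fields whose values differ from the primary copy.
            changed = {k: v for k, v in record.items() if primary[record_id].get(k) != v}
            if changed:
                updates.append(("modify", record_id, changed))
    for record_id in primary:
        if record_id not in pulled:
            updates.append(("delete", record_id, {}))
    return updates


primary = {"r1": {"title": "A", "tags": ["x"]}, "r2": {"title": "B", "tags": []}}
pulled = {"r1": {"title": "A", "tags": ["x", "y"]}, "r3": {"title": "C", "tags": []}}
updates = find_updates(primary, pulled)
```

In this example the result contains one modification (`r1`), one insertion (`r3`), and one deletion (`r2`), which map onto the minor and major events handled by the Digital Entity Manager.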

Digital Entity Manager
- Responsible for:
  - Event and dataset creation
  - Event processing
- Manages updates made to the primary copy of a digital entity
  - Updates the primary copy located in the system database
  - Passes updates to the Communication Manager (strict consistency by pushing updates immediately)
- Handles periodic update management
- Deals with history and rollback management of a digital entity
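History and rollback follow naturally if the state of a digital entity is rebuilt by replaying its time-stamped events. The Python sketch below illustrates that idea under stated assumptions: events are plain dictionaries with hypothetical keys `ts`, `action`, and `data`, and integer timestamps stand in for real ones.

```python
from typing import Optional


def rebuild(events: list[dict], until: Optional[int] = None) -> Optional[dict]:
    """Replay events in time order and return the resulting entity state.

    Returns None if the entity does not exist at that point in time.
    """
    state: Optional[dict] = None
    for event in sorted(events, key=lambda e: e["ts"]):
        if until is not None and event["ts"] > until:
            break
        if event["action"] == "insert":
            state = dict(event["data"])
        elif event["action"] == "delete":
            state = None
        elif event["action"] == "modify" and state is not None:
            state.update(event["data"])
    return state


history = [
    {"ts": 1, "action": "insert", "data": {"title": "Draft", "tags": []}},
    {"ts": 2, "action": "modify", "data": {"tags": ["grid"]}},
    {"ts": 3, "action": "modify", "data": {"title": "Final"}},
]
current = rebuild(history)               # state after all events
rolled_back = rebuild(history, until=2)  # state as of timestamp 2
```

Rolling back is then just a replay that stops earlier; no separate copy of each version has to be stored.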

Key Design Features
- Representation of document metadata coming from various sources as events
  - Major and minor events
- More metadata support than the major current annotation tools
- Ability to access and roll back to previous versions of documents
- Unification and federation of the Connotea, Delicious, and Citeulike tools, and support for web-based academic search tools for scientific research
  - Using annotation tools' existing services with added capabilities
  - Supporting major online search tools to collect metadata
  - Providing communication among annotation tools
- Leveraging interoperability via a service-enabled architecture
- Keeps records located at the annotation tools and in the system database consistent with each other
  - Adopting time-based and strict consistency approaches
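The combination of the two consistency approaches can be pictured as follows: changes to the primary copy are pushed to every tool immediately (strict), while changes made at a tool are picked up by a periodic pull (time-based). This is a simplified Python sketch under those assumptions. `PrimaryStore`, `apply_update`, and `pull_from` are hypothetical names, and plain dictionaries stand in for the annotation tools.

```python
class PrimaryStore:
    """Holds primary copies and pushes each change to the replicas right away."""

    def __init__(self, replicas: list[dict]) -> None:
        self.records: dict[str, dict] = {}
        self.replicas = replicas  # plain dicts standing in for annotation tools

    def apply_update(self, record_id: str, changes: dict) -> None:
        # Update the primary copy first.
        self.records.setdefault(record_id, {}).update(changes)
        # Then push the new state to every replica (strict, push-based).
        for replica in self.replicas:
            replica[record_id] = dict(self.records[record_id])

    def pull_from(self, replica: dict) -> None:
        # Time-based, pull-based step; a scheduler would call this periodically.
        for record_id, record in list(replica.items()):
            if self.records.get(record_id) != record:
                self.apply_update(record_id, record)


tool_a: dict[str, dict] = {}
tool_b: dict[str, dict] = {}
store = PrimaryStore([tool_a, tool_b])
store.apply_update("r1", {"title": "Sample"})  # pushed to both tools at once
tool_a["r1"]["tags"] = ["grid"]                # a user edits the record at tool A
store.pull_from(tool_a)                        # the periodic pull picks up the edit
```

After the pull, the edit made at tool A is visible in the primary copy and in tool B as well. Conflict handling between simultaneous edits is left out of this sketch.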

Use Cases
- Collaborative tagging
  - Updating or assigning keywords to records
- Collecting and managing citation metadata
  - Obtaining metadata about a publication through online scholarly search tools or annotation tools
- Unification and federation of the Connotea, Citeulike, and Delicious annotation tools
  - Providing schema and communication among them
- Tracking updates to documents
  - Rolling back to previous states
  - Building versions of documents based on users, groups, or all events
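Building a version of a document from the events of one user, a group, or everyone amounts to filtering the event stream before applying it. The Python sketch below shows one possible reading. The event layout (`ts`, `user`, `data`) and the function `build_version` are assumptions, and a group is modelled simply as a set of user names.

```python
from typing import Optional


def build_version(events: list[dict], users: Optional[set] = None) -> dict:
    """Apply events in time order, optionally keeping only those made by `users`."""
    state: dict = {}
    for event in sorted(events, key=lambda e: e["ts"]):
        if users is None or event["user"] in users:
            state.update(event["data"])
    return state


events = [
    {"ts": 1, "user": "alice", "data": {"title": "Paper", "tags": ["grid"]}},
    {"ts": 2, "user": "bob", "data": {"tags": ["grid", "web2.0"]}},
    {"ts": 3, "user": "alice", "data": {"note": "read"}},
]
all_events = build_version(events)                # version from all events
alice_only = build_version(events, {"alice"})     # version from one user
group = build_version(events, {"alice", "bob"})   # version from a group
```

With these sample events, the version built only from alice's events keeps her original tags, while the all-events version carries bob's later tag update.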

Benchmarks and Environments
- Message rate scalability investigation
  - MoreInfo operation
    - With DB access
    - With memory utilization
  - Update DE operation
- We have used:
  - Java 2 Standard Edition compiler, version 1.5.0_12, with the maximum heap size of the Java Virtual Machine (JVM) set to 1024 MB
  - Apache Tomcat Server [version number missing]
  - Apache Axis technology [version number missing]

[Figure: Message rate scalability investigation result (DB Usage) - I]

[Figure: Message rate scalability investigation result (Memory Utilization) - II]

[Figure: Message rate scalability investigation result (Update DE) - III]

Overheads for updating Memory and DB
Table columns: Message Rate (message/sec) | Overhead Time (DB) (msec) | STDev for DB | Overhead Time (Memory) (msec) | STDev for Memory

Contributions
- System research
  - Event-based Infrastructure
    - Unification, federation, and interoperability of the Connotea, Delicious, and Citeulike annotation tools
    - Strategies for increasing performance and scalability via a top-to-bottom approach and memory utilization
    - Handling various types of metadata coming from several sources
    - Flexibility to access previous versions of a document
  - Adopting consistency enforcement approaches to maintain consistency
  - Comprehensive benchmarks to evaluate the scalability of the prototype system
- System software
  - An implementation of the Event-based Infrastructure of the Internet Documentation and Integration of Metadata (IDIOM) system
  - An implementation of the consistency maintenance mechanism for the IDIOM system

Future Work
- Applying the Event-based Infrastructure to a broader range of application use cases
  - Supporting video collaboration tools (e.g. YouTube)
  - Social networking (e.g. Facebook)
- Unification and federation of other academic collaboration and publication tools into EBI
  - e.g. BibSonomy
- Moving from a single metadata store to distributed storage

Publications
- Book Chapters
  1. Web 2.0 for Grids and e-Science; Geoffrey C. Fox, Rajarshi Guha, Donald F. McMullen, Ahmet Fatih Mustacoglu, Marlon E. Pierce, Ahmet E. Topcu, David J. Wild. Published by Springer, Grid Enabled Remote Instrumentation (Chapter: Web 2.0 for Grids and e-Science)
- Publications
  1. Hybrid Consistency Framework for Distributed Annotation Records in a Collaborative Environment; Ahmet Fatih Mustacoglu and Geoffrey Fox
  2. Web 2.0 for E-Science Environments Keynote Presentation; Geoffrey C. Fox, Marlon E. Pierce, Ahmet Fatih Mustacoglu, Ahmet E. Topcu
  3. Integration of Collaborative Information Systems in Web 2.0; Ahmet E. Topcu, Ahmet Fatih Mustacoglu, Geoffrey Fox, Aurel Cami
  4. SRG: A Digital Document-Enhanced Service Oriented Research Grid; Geoffrey Fox, Ahmet Fatih Mustacoglu, Ahmet E. Topcu, Aurel Cami
  5. AJAX Integration Approach for Collaborative Calendar-Server Web Services; Ahmet Fatih Mustacoglu, Geoffrey Fox
  6. A Novel Event-Based Consistency Model for Supporting Collaborative Cyberinfrastructure Based Scientific Research; Ahmet Fatih Mustacoglu, Ahmet E. Topcu, Aurel Cami, Geoffrey Fox
  7. iCalendar (RFC2445) Compatible Collaborative Calendar-Server Services; Ahmet Fatih Mustacoglu, Wenjun Wu, Geoffrey Fox

Tools for Annotation and Sharing Publications
- They are used for:
  - Collecting data and metadata
  - Annotating data
  - Sharing papers
- Limitations of these tools:
  - Different and limited metadata storage
  - Need to enter the same entry into each tool
  - No timing information for updated records
  - Lack of ability to transfer data between tools
  - Lack of services to extract data and import it into a repository
  - Lack of services to upload data from a repository
