Ben Szekely, IBM Cambridge Adtech. © 2006 IBM Corporation. TDWG GUID Workshop, February 1, 2006. LSID as a Technology: Overview, Participation and Related Projects.

Presentation transcript:

My background
- LSID and Semantic Web for 3 years
  - LSID Java Toolkit
  - OMG Specification
  - BioitWorld 2004, BIO 2003
- Semantic Web research interests
  - Semantic web through social computing
  - (Semantic Web)-application development
  - Semantic (Web-application) development
  - Semantic workflows

LSID Overview - Syntax
- 5-part format: urn:lsid:authority:namespace:object[:revision]
  - urn:lsid: mandatory prefix
  - authority: unique string, e.g. the domain name of an organization
  - namespace: alphanumeric sequence that constrains the scope, e.g. to a particular database, species, etc.
  - object: alphanumeric sequence describing the object
  - [revision]: optional alphanumeric sequence describing the version of the object
- Example: urn:lsid:ncbi.nlm.nih.gov:genbank:af
- Example: urn:lsid:pdb.org:pdb:1aft:1
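The format above can be checked mechanically. The following is a minimal sketch, not the LSID Java Toolkit's API: the names `Lsid` and `parse_lsid` are hypothetical, and treating the `urn:lsid` prefix as case-insensitive is an assumption of this sketch.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """The variable parts of an LSID (hypothetical helper type)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> Lsid:
    """Split an LSID string into its components, or raise ValueError."""
    parts = text.split(":")
    # Expect urn, lsid, authority, namespace, object and optionally revision.
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    # Assumption: the mandatory prefix is matched case-insensitively.
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"missing urn:lsid prefix: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"authority, namespace and object are required: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(parts[2], parts[3], parts[4], revision)


pdb_entry = parse_lsid("urn:lsid:pdb.org:pdb:1aft:1")
print(pdb_entry.authority, pdb_entry.namespace, pdb_entry.object_id, pdb_entry.revision)
```

The second example from the slide carries a revision (`1`); the first one has none, so its `revision` field would be `None`.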

LSID Overview - Resolution
[Diagram: a Client interacting with DNS/DDDS, the LSID Authority, the Data Service and the Metadata Service]
- 1a: DDDS NAPTR lookup
- 1b: SRV record lookup
- 2: getAvailableServices(), which returns WSDL
- 3a: getData()
- 3b: getMetadata()
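The numbered steps in the diagram can be sketched as a short client-side flow. This is an illustration only: the DNS lookup and the authority call are injected as plain functions so the sketch runs offline, the `_lsid._tcp` SRV label is an assumption, and every host name, function name and payload below is a hypothetical stand-in rather than a real service.

```python
from typing import Callable, Dict, Tuple

Endpoint = Tuple[str, int]
Services = Dict[str, Callable[[str], bytes]]


def srv_query_name(authority: str) -> str:
    """Name queried in step 1b; the _lsid._tcp label is an assumption here."""
    return f"_lsid._tcp.{authority}"


def resolve(
    lsid: str,
    lookup_srv: Callable[[str], Endpoint],
    get_available_services: Callable[[Endpoint, str], Services],
) -> Tuple[bytes, bytes]:
    """Walk steps 1 to 3 of the diagram and return (data, metadata)."""
    authority = lsid.split(":")[2]
    endpoint = lookup_srv(srv_query_name(authority))   # steps 1a/1b: find the authority
    services = get_available_services(endpoint, lsid)  # step 2: which services serve this LSID
    data = services["data"](lsid)                      # step 3a: getData()
    metadata = services["metadata"](lsid)              # step 3b: getMetadata()
    return data, metadata


# Hypothetical stand-ins for DNS and for the authority's service description.
def fake_lookup_srv(name: str) -> Endpoint:
    return {"_lsid._tcp.pdb.org": ("lsid.example.org", 8080)}[name]


def fake_get_available_services(endpoint: Endpoint, lsid: str) -> Services:
    return {
        "data": lambda requested: b"placeholder data bytes",
        "metadata": lambda requested: b"placeholder RDF metadata",
    }


data, metadata = resolve(
    "urn:lsid:pdb.org:pdb:1aft:1", fake_lookup_srv, fake_get_available_services
)
```

Keeping the lookups injectable mirrors the point of the diagram: the client only knows the LSID, and everything about where the data lives is discovered at resolution time.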

LSID Overview - Comparison with URLs
URL:
- Tied to physical addresses
- Server structure may change frequently
- Brittle (broken links)
- One location only
- One protocol per URI
LSID:
- Same name = same content, always
- Location independent
- Enables transparent caching
- Formalized, rich multi-sourced metadata retrieval

LSID Overview - Implementation Basics
- Accessing LSID data is as easy as
  - Opening a stream to the data, metadata
  - Reading the metadata to acquire context
- Providing data via LSID is as easy as
  - Logically assigning LSIDs to data items
  - Implementing a simple API (getData(), getMetadata())
  - Deploying a web application
- Example: Genbank (NIH nucleotide database)
  - Logically assign LSIDs based on accession #
  - Access Genbank data via a WSDL-defined Web Service
  - Convert Genbank WSDL-generated objects to OWL-generated objects (Jastor)
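The provider side described above (assign LSIDs, implement getData() and getMetadata()) can be sketched with an in-memory store. This is a hedged illustration, not the toolkit's interface: the class `InMemoryAuthority`, its method names, and the accession `ABC123` are all hypothetical placeholders, and a real deployment would wrap this in a web application.

```python
from typing import Dict, Tuple


class InMemoryAuthority:
    """Toy LSID provider: assigns names and serves data and metadata (hypothetical)."""

    def __init__(self, authority: str, namespace: str) -> None:
        self.authority = authority
        self.namespace = namespace
        self._records: Dict[str, Tuple[bytes, str]] = {}

    def assign(self, object_id: str, data: bytes, metadata: str) -> str:
        """Logically assign an LSID to a data item, e.g. from an accession number."""
        lsid = f"urn:lsid:{self.authority}:{self.namespace}:{object_id}"
        self._records[lsid] = (data, metadata)
        return lsid

    def get_data(self, lsid: str) -> bytes:
        """Counterpart of getData(): the bytes named by the LSID."""
        return self._records[lsid][0]

    def get_metadata(self, lsid: str) -> str:
        """Counterpart of getMetadata(): a description that gives the data context."""
        return self._records[lsid][1]


genbank = InMemoryAuthority("ncbi.nlm.nih.gov", "genbank")
# "ABC123" is a placeholder accession, not a real Genbank record.
lsid = genbank.assign("ABC123", b"placeholder sequence bytes", "placeholder RDF description")
print(lsid)
```

A consumer then needs only the LSID: it reads `get_metadata(lsid)` to learn what the item is and `get_data(lsid)` to fetch it.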

LSID Overview - Summary of advantages
- Location independence and high availability provided by DDDS NAPTR, DNS SRV, and WSDL
- Multiple data mirrors and metadata sources provided by WSDL
- Authority may be used to provide references to additional services: search, BLAST, etc.
- Metadata describes attributes and relationships
- Easy implementation and use by anyone

Metadata vs. Data
- Should there exist metadata-only LSIDs?
- Certainly!
  - Abstract or conceptual LSIDs, e.g. an LSID that contains only metadata about an image, but that points to multiple LSIDs containing the image data in different formats
  - LSIDs that reference complex objects in a database
  - LSIDs that link together groups of LSIDs (e.g. synonyms)
[Figure: LocusLink]

Metadata vs. Data
- What happens to consumers of an LSID if the metadata changes?
- Remember, though we use RDF for metadata, nothing prevents us from returning immutable RDF as data
  - Problem: graph equality does not imply byte equality
  - Solution: materialize the RDF serialization once, assign an LSID and cache it. If the underlying object changes, create a new serialization with a new version.
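The problem and the solution on this slide can be made concrete with a small sketch. It assumes a simplified N-Triples-like serializer written for this example (not an RDF library), and the LSIDs under `example.org`, the triples, and the class `SerializationStore` are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def serialize(triples_in_order: Iterable[Triple]) -> bytes:
    """Write triples in the given order as simplified N-Triples-like lines."""
    lines = [f'<{s}> <{p}> "{o}" .' for s, p, o in triples_in_order]
    return "\n".join(lines).encode("utf-8")


graph: Set[Triple] = {
    ("urn:lsid:example.org:img:1", "http://purl.org/dc/elements/1.1/format", "image/png"),
    ("urn:lsid:example.org:img:1", "http://purl.org/dc/elements/1.1/title", "Leaf"),
}

# Problem: the same graph written in two statement orders gives different bytes.
bytes_a = serialize(sorted(graph))
bytes_b = serialize(sorted(graph, reverse=True))
assert bytes_a != bytes_b


class SerializationStore:
    """Materialize a serialization once per LSID and never rewrite it."""

    def __init__(self) -> None:
        self._frozen: Dict[str, bytes] = {}

    def publish(self, lsid: str, triples: Set[Triple]) -> None:
        if lsid in self._frozen:
            raise ValueError(f"{lsid} is already bound; publish a new revision instead")
        self._frozen[lsid] = serialize(sorted(triples))

    def get_data(self, lsid: str) -> bytes:
        return self._frozen[lsid]


store = SerializationStore()
store.publish("urn:lsid:example.org:meta:leaf:1", graph)
# The underlying object changes: publish a new serialization under a new revision.
changed = graph | {("urn:lsid:example.org:img:1", "http://purl.org/dc/elements/1.1/creator", "A. Nonymous")}
store.publish("urn:lsid:example.org:meta:leaf:2", changed)
```

Consumers holding revision 1 keep receiving exactly the bytes they saw before, which preserves the "same name = same content" guarantee even though the content is RDF.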

LSID Participation - Organizations
- I3C origins (folded into W3C)
  - Original body responsible for LSID
  - BioIT World, BIO
- Object Management Group (OMG)
  - Holds the current standard
- BioPathways Consortium
  - Hosts 3rd-party LSID resolution services
- IBM
  - Contributor to the standard and to open source implementations
  - Technical support for early adopters

LSID Participation - Early Adopters
- University of Wisconsin CFL
- Biomoby
- Mygrid (European e-Science)
- Ecological Society of America Data Registry
- Lawrence Berkeley Labs
- Broad Institute of Genomics
- Many more, just Google "urn:lsid:"

Cambridge Adtech

Adtech Semantic Web Projects - SLRP
- Semantic Layered Research Platform
- RDF-based system for managing laboratory experiments
  - Papers
  - Workflow
  - People
  - Provenance
  - Data
- Initially developed for CViT.org
- Composed of many reusable and standalone components

Adtech Semantic Web Projects - CART
- RDF triples stored in a central relational database
- [C] Triples are grouped into collections
  - LSID resolution service serves collections of RDF
- [A] ACLs specified at the collection level
- Clients maintain local subsets of the triple store based on what they are interested in
- [R] Client stores are updated by pub/sub messaging (push) and replication (pull)
- Clients can "track" sets of triples based on triple patterns or collections
- [T] Updates to the central store are performed in transactions
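The tracking and push behaviour described above can be sketched in a few lines. This is a toy model under stated assumptions, not CART's design or API: the classes `CentralStore` and `Client`, the use of `None` as a wildcard in a triple pattern, and the sample triples are all hypothetical, and collections, ACLs and pull replication are left out.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(pattern: Pattern, triple: Triple) -> bool:
    """A pattern position of None matches anything; otherwise it must be equal."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


class Client:
    """Keeps a local subset of the store: only triples matching tracked patterns."""

    def __init__(self, patterns: List[Pattern]) -> None:
        self.patterns = patterns
        self.local: Set[Triple] = set()

    def notify(self, added: Set[Triple]) -> None:
        self.local.update(
            t for t in added if any(matches(p, t) for p in self.patterns)
        )


class CentralStore:
    """Applies each update as one batch, then pushes it to subscribed clients."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.clients: List[Client] = []

    def subscribe(self, client: Client) -> None:
        self.clients.append(client)

    def commit(self, added: Iterable[Triple]) -> None:
        batch = set(added)
        self.triples |= batch          # the whole batch becomes visible together
        for client in self.clients:    # push: notify every subscriber
            client.notify(batch)


store = CentralStore()
author_tracker = Client([(None, "ex:author", None)])
store.subscribe(author_tracker)
store.commit([
    ("ex:paper1", "ex:author", "ex:alice"),
    ("ex:paper1", "ex:title", "A title"),
])
```

After the commit, `author_tracker.local` holds only the `ex:author` triple, while the central store holds both.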

Adtech Semantic Web Projects - DDR
- Distributed Data Repository
- Designed to assign LSIDs to newly created data
  - Text documents, images, spreadsheets, workflow output, etc.
- Highly concerned with versioning and access control
- Stores metadata in CART
- Summary: CART + DDR is a powerful LSID implementation platform for file data

Adtech Semantic Web Projects - Slingshot
- Distributed OWL-S execution engine
- Workflow state stored centrally in CART
- Participants subscribe to the collection representing the workflow document and perform tasks when it is their turn
- Result data stored as LSIDs in DDR, referenced in the OWL-S document in CART

Adtech Semantic Web Projects - Telar
- Writing apps against a single Jena Model is (relatively) easy
- In the real world, apps must query, update, and perform inference across multiple models
- Telar provides libraries for building such real-world RDF applications
- Telar-UI provides libraries for building RDF- and ontology-driven user interfaces on the Eclipse platform

Adtech Semantic Web Projects - Jastor (jastor.sourceforge.net)
- RDF structure is defined by OWL ontologies
  - Partially Java-style object oriented: classes, subclasses
  - Additional constructs: unions, intersections, multiple inheritance
- RDF manipulation in Java using pure Jena is difficult
  - Lots of verbose error checking required
  - No ontology-driven compile-time checking
- Jastor generates APIs directly from OWL ontologies
  - Compile-time checking of ontology compliance: ontology changes become compile-time errors
  - Syntax assistance in IDEs (Eclipse)
  - Programmer shielded from tedious error checking
- Auto-generation of data-access APIs is a good programming practice

Adtech Semantic Web Projects - Odo
- Trying to do some (or all!) of the above in Perl

Adtech Semantic Web Projects - Annotation
- Windows client library for writing plugins to annotate parts of documents
- Plugins exist for Acrobat, Word, PowerPoint, Excel and IE
- Client communicates with the Annotation Server via a Web Service
- Annotation data stored in RDF

Adtech Semantic Web Projects - Summary
- We have lots of cool (and hopefully useful) prototypes going on.
- We are interested in hearing about LSID and Semantic Web scenarios and applications.
- We would happily host any interested parties at our lab in Cambridge, Mass. for a morning, afternoon or day of demos and discussion.

Questions, Comments, Concerns, Complaints?