LIS901N: URI Thomas Krichel 2003-01-??


URIs (background)
URI: uniform resource identifier. Originally, a generalization of:
– URL (uniform resource locator),
– URN (uniform resource name),
– URC (uniform resource citation),
– and potentially others, but mainly URL and URN.

The difference (in theory) between URL and URN:
a URL is bound to a location
– when the resource moves, the URL changes
a URN is a name
– thus location independent and, in theory, persistent (whatever persistent means)

The Other View
The distinction between URL and URN is artificial. Both terms should be abolished and replaced by URI. Thus all identifier schemes would be URI schemes (even http), and no prefix would be necessary (URL, URN, or even URI).

Reasoning
Original URI philosophy:
– URLs were a short-term solution and URNs long-term.
– The URL would be a temporary identification mechanism until a location-independent, persistent identifier was developed: the URN.
Now it seems:
– URNs won't be any more persistent than URLs.
– Persistence is a social problem, not a technical problem.

URI vs URL
The term URL, or Uniform Resource Locator, is not used in standards anymore. It generally means a URI that contains a domain name, but the term is historical only. This presentation uses the term URI exclusively. The term URL is still sufficient to convey the meaning but should not be used when precision is necessary.

What does a URI identify?
A URI identifies a Resource. A URI only comes into existence when it is bound to a Resource. A Resource is defined as anything that is identified by a URI. A Resource only comes into existence when a URI is bound to it. A URI cannot exist without a Resource. A Resource cannot exist without a URI.

it all comes from Plato
The formalism "a URI identifies an abstract Resource" assumes the Platonic concept of form. A Resource, once bound to a URI and brought into existence, is only the abstract essence of the real-world thing we perceive. Any physical or digital version of that Resource is only one of all possible physical representations of that Resource. For example, is a URI for a homepage. Using language and content negotiation it is possible to request that page in many languages and formats. Which version is the Resource? Answer: none of them. Each is only a representation. It is possible to assign a URI even to the representations. Even so, each Resource is only the abstraction of the physical or digital thing, not the thing itself.

What is resolution?
Resolution means accessing some representation of the Resource that a URI identifies.
– For it means accessing the homepage of foo.com
– For it can mean sending a message to that address.
For URIs that contain network location information it is simply a matter of visiting that location and performing some function. I.e., foo.com is the exact network host that can give you the web page.
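As a rough sketch of why resolution is simple when a URI carries network location information, Python's standard `urllib.parse.urlsplit` can show which part of a URI names a host. The two URIs below are hypothetical stand-ins chosen for illustration (the slide's own examples are not shown), and `describe` is a helper name invented here.

```python
from urllib.parse import urlsplit

# Hypothetical example URIs, not taken from the slides.
HTTP_URI = "http://foo.com/"
MAIL_URI = "mailto:someone@example.org"

def describe(uri):
    """Return (scheme, network location, path) for a URI string."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path

# The http URI exposes a host to contact; the mailto URI has no
# network location, only an address carried in the path.
print(describe(HTTP_URI))
print(describe(MAIL_URI))
```

The http URI yields a non-empty network location, which is the host to visit. The mailto URI yields none, so "resolving" it means doing something else with the address.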

The history
Tim Berners-Lee came to the IETF in 1992 to develop the WorldWideWeb standards. At the time URIs were known as Universal Resource Locators. RFC 1738 Uniform Resource Locators (URL) was published in RFC 1738 was updated by RFC 1808, RFC 2368, RFC RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax is the current standard. RFC 2396 may be updated to reflect developments in internationalization, terminology updates, and registration procedures.

Confusion…
Due to misunderstandings and the formation of the W3C separately from the IETF, there was a long-term disagreement on certain aspects of URIs, especially when it came to Uniform Resource Names (URNs). A joint IETF/W3C URI Interest Group was formed in 2000 to investigate work that needed to be done with URIs in general. That group published "URIs, URLs, and URNs: Clarifications and Recommendations", a report from the joint W3C/IETF URI Planning Interest Group (draft-mealling-uri-ig-01.txt), which begins to clarify the problems and proposes solutions.

URN
Uniform Resource Names are defined by RFC 2141 as a particular URI scheme with these characteristics:
– Permanent: once a URN is assigned to some Resource it can never be re-assigned to something else.
– Location independent: the actual URN should not contain any network location information such as domain names, IP addresses, file path names, etc.
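To make the shape of a URN concrete, here is a minimal sketch that splits a string of the form `urn:<NID>:<NSS>` (namespace identifier and namespace-specific string, as laid out in RFC 2141). The pattern is deliberately simplified: it checks the NID but treats the NSS loosely. `parse_urn` is a name made up for this example, and the ISBN URN is only illustrative.

```python
import re

# Simplified pattern for "urn:<NID>:<NSS>".
# NID: 1 to 32 letters, digits, or hyphens, starting with a letter or digit.
# NSS: only loosely checked here (any non-empty run without whitespace).
URN_PATTERN = re.compile(
    r"^urn:([A-Za-z0-9][A-Za-z0-9-]{0,31}):(\S+)$",
    re.IGNORECASE,
)

def parse_urn(text):
    """Return (nid, nss) for a URN-shaped string, or None if it does not match."""
    match = URN_PATTERN.match(text)
    if match is None:
        return None
    nid, nss = match.group(1), match.group(2)
    # The "urn:" prefix and the NID are case-insensitive, so normalize the NID.
    return nid.lower(), nss

print(parse_urn("urn:isbn:0451450523"))
print(parse_urn("http://foo.com/"))
```

Note that nothing in the parsed pieces names a host or a file path, which is the location independence the slide describes.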

RFC2396
Berners-Lee, Tim, Roy T. Fielding and Larry Masinter (1998) ``Uniform Resource Identifiers (URI): Generic Syntax'', rfc2396
A Uniform Resource Identifier (URI) is a compact string of characters for identifying an abstract or physical resource. URIs provide a simple and extensible means for identifying a resource.

operations on a URI
There is a set of operations that can be applied to URIs. For example, for a URL, one such operation is access to the resource. To understand whether a given URI instance is valid, we have to study the operations applied to URIs.

benefits of uniformity
– It allows different types of resource identifiers to be used in the same context, even when the mechanisms used to access those resources may differ.
– It allows uniform semantic interpretation of common syntactic conventions across different types of resource identifiers.
– It allows introduction of new types of resource identifiers without interfering with the way that existing identifiers are used.
– It allows the identifiers to be reused in many different contexts, thus permitting new applications or protocols to leverage a pre-existing, large, and widely-used set of resource identifiers.

Resources and Identity in the RFC
A resource can be anything that has identity. Not all resources are network ``retrievable''. The resource is the conceptual mapping to an entity or set of entities, not necessarily the entity which corresponds to that mapping at any particular instance in time. An identifier is an object that can act as a reference to something that has identity. In the case of URI, the object is a sequence of characters with a restricted syntax.

URI, URL, & URN in the RFC
A URI can be further classified as a locator, a name, or both. The term ``Uniform Resource Locator'' (URL) refers to the subset of URI that identify resources via a representation of their primary access mechanism (e.g., their network location), rather than identifying the resource by name or by some other attribute(s) of that resource. The term ``Uniform Resource Name'' (URN) refers to the subset of URI that are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable.

URN in the RFC
A URN differs from a URL in that its primary purpose is persistent labeling of a resource with an identifier. That identifier is drawn from one of a set of defined namespaces, each of which has its own set name structure and assignment procedures. The urn scheme has been reserved to establish the requirements for a standardized URN namespace, as defined in URN Syntax RFC2141 and its related specifications.

transcribability
The URI syntax was designed with global transcribability as one of its main concerns. A URI is a sequence of characters from a very limited set, i.e. the letters of the basic Latin alphabet, digits, and a few special characters. A URI may be represented in a variety of ways.

consequences of transcribability
– A URI is a sequence of characters, which is not always represented as a sequence of octets.
– A URI may be transcribed from a non-network source, and thus should consist of characters that are most likely to be able to be typed into a computer, within the constraints imposed by keyboards (and related input devices) across languages and locales.
– A URI often needs to be remembered by people, and it is easier for people to remember a URI when it consists of meaningful components.

URI characters
URIs consist of a restricted set of characters, not a sequence of octets. The allowable characters are primarily chosen to aid transcribability and usability both in computer systems and in non-computer communications. Characters used conventionally as delimiters around URIs are excluded. In the simplest case, the original character sequence contains only characters that are defined in US-ASCII, and the two levels of mapping are simple and easily invertible: each 'original character' is represented as the octet for the US-ASCII code for it, which is, in turn, represented as either the US-ASCII character or else the "%" escape sequence for that octet.

reserved characters
Many URIs include components consisting of, or delimited by, certain special characters. These characters are called ``reserved'', since their usage within the URI component is limited to their reserved purpose. If the data for a URI component would conflict with the reserved purpose, then the conflicting data must be escaped before forming the URI. In RFC 2396 they are ; / ? : @ & = + $ , They are allowed within a URI, but may not be allowed within a particular component of the generic URI syntax.
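The conflict described above can be sketched with the query component, where `&` and `=` act as delimiters. The example below uses Python's standard `urllib.parse`. The URI, the field names, and the helper `build_query` are all hypothetical and chosen only for illustration.

```python
from urllib.parse import parse_qs, quote, urlsplit

def build_query(pairs):
    """Join (name, value) pairs into a query string, escaping every
    character that could clash with the reserved '&' and '=' delimiters."""
    return "&".join(
        "{}={}".format(quote(name, safe=""), quote(value, safe=""))
        for name, value in pairs
    )

# The value "Tom&Jerry" contains a reserved character, so it is
# escaped before the query string is formed.
query = build_query([("title", "Tom&Jerry"), ("lang", "en")])
uri = "http://example.org/docs/page?" + query

# Splitting on the reserved delimiters recovers the original data.
fields = parse_qs(urlsplit(uri).query)
print(query)
print(fields)
```

Without the escaping step, the `&` inside the value would be read as a field separator and the data would be split in the wrong place.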

unreserved & excluded characters
Unreserved characters are those that are allowed and never take any special meaning. They are
– the uppercase and lowercase letters A to Z and a to z
– the decimal digits 0 to 9
– the following: - _ . ! ~ * ' ( )
All characters that are neither reserved nor unreserved are excluded:
– < > # % " { } | \ ^ [ ] `
– and the blank.
They have to be escaped.
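The three character classes can be written down as a small lookup. This is a sketch that takes the reserved and unreserved sets from RFC 2396 (sections 2.2 and 2.3) and treats everything else as excluded. The function name `classify` is made up for this example.

```python
import string

# Character classes as given in RFC 2396, sections 2.2 and 2.3.
RESERVED = set(";/?:@&=+$,")
UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")

def classify(ch):
    """Classify a single character as reserved, unreserved, or excluded."""
    if ch in RESERVED:
        return "reserved"
    if ch in UNRESERVED:
        return "unreserved"
    # Everything else, including '%', '#', and the blank, must be escaped
    # (or, for '%', is only used to introduce an escape).
    return "excluded"

for ch in "a?# ~":
    print(repr(ch), classify(ch))
```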

escaping
When you want to use a character in a URI that is one of the excluded characters, you have to escape it. This is done by writing a construction of the form % hex hex, where hex is a digit or one of the letters a to f (uppercase or lowercase). The two hex characters represent the octet code of the character in hex (for US-ASCII characters this is the same as the Unicode value). For example, %7e is the character ~
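The escape rule can be sketched in a few lines. `escape_char` below is a hypothetical helper that handles US-ASCII characters only. The reverse direction uses `urllib.parse.unquote` from Python's standard library.

```python
from urllib.parse import unquote

def escape_char(ch):
    """Escape one US-ASCII character as '%' followed by two hex digits."""
    code = ord(ch)
    if code > 127:
        # Assumption of this sketch: only US-ASCII input is handled.
        raise ValueError("only US-ASCII characters are handled here")
    return "%{:02x}".format(code)

print(escape_char(" "))   # the blank is an excluded character
print(escape_char("~"))
print(unquote("%7e"))     # decoding accepts lowercase hex digits
print(unquote("%7E"))     # and uppercase ones
```

Real code would normally call `urllib.parse.quote` rather than escaping characters one at a time. The manual version is shown only to mirror the % hex hex rule.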

The Semantic Web
The W3C has been developing a new architecture that applies knowledge representation technology to the WWW. Using the Resource Description Framework (RDF), Statements are made using a Subject, Predicate and Object (very similar to Lisp and other predicate based languages). Each Subject, Predicate or Object is a Resource in the URI sense and is identified by a URI within an RDF Statement using XML Namespaces.
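The Subject, Predicate, Object structure can be sketched as a plain three-field record. This is only an illustration of the data model, not RDF/XML syntax and not any particular RDF library. The subject and object URIs are hypothetical, while the predicate is the Dublin Core "creator" element URI.

```python
from collections import namedtuple

# A Statement is a (subject, predicate, object) triple of URIs.
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

statements = [
    Statement(
        subject="http://example.org/some-page",          # hypothetical
        predicate=DC_CREATOR,
        object="http://example.org/people/some-person",  # hypothetical
    ),
]

def objects_of(stmts, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [s.object for s in stmts
            if s.subject == subject and s.predicate == predicate]

print(objects_of(statements, "http://example.org/some-page", DC_CREATOR))
```

Because every field is a URI, a statement made in one place can refer to Resources defined anywhere else.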

example
This statement says that the Resource identified by the URI was created by the person Thomas Krichel: Ora Lassila

The Semantic Web
The combination of Web Services and the Semantic Web should give the Web the ability to turn any existing Web Resource into a full node in a purposefully built knowledge representation system with a functional component that allows that knowledge to be acted on. And both are based on the simple Uniform Resource Identifier.

Thank you for your attention!