OAI-ORE
Venkata Krishna Potta and Ketan Reddy Peddabachi
Instructor: Dr. Michael Nelson
1/14/2019 CS 791/891 "Web Syndication Formats"

Presentation Overview
- Introduction
- The ORE Model
- Resource Map Implementation in Atom
- ReM Demo
- Discovery of ReMs
- Conclusion

OAI: Open Archives Initiative
- OAI started as an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication.
- It now works on opening up access to a range of digital materials.
- Major projects: OAI-ORE and OAI-PMH

Open Archives Initiative Object Reuse and Exchange (OAI-ORE)
- OAI-ORE defines standards for the description and exchange of aggregations of Web resources.
- ORE will develop specifications that allow distributed repositories to exchange information about their constituent digital objects. These specifications:
  - will include approaches for representing digital objects and repository services that facilitate access to and ingest of these representations.
  - will enable a new generation of cross-repository services that leverage the intrinsic value of digital objects beyond the borders of hosting repositories.

Terms
- Resource: any item of interest found on the Web.
- Aggregation: a collection of related resources.
- Aggregated Resource: a resource that is part of an aggregation.

Examples of Aggregations
- A collection of favorite images from various Web sites. (Example: My Images)
- A blog consisting of related images and videos. (Example: CNN blog)
- A multi-page HTML document where the pages are linked together by hyperlinks that provide "previous page" and "next page" access. (Example: HTML Pages)
- Information available from "social networking" sites, such as Flickr, YouTube, and MySpace. (Example: Photos)

Aggregation and Aggregated Resources
(figure: the Web)

Aggregation and Aggregated Resources
(figure labels: ReM, Resource, Aggregated Resource, Aggregation)

The ORE Model
Resource Map (ReM)
- A Resource that specifies a URI for an Aggregation
- Enumerates the constituents of the Aggregation
- Describes the relationships among them
Identification
- A Resource Map is identified by a URI (URI-R).
- An Aggregation is identified by a URI (URI-A): URI-R plus a fragment identifier, so that dereferencing URI-A leads to the Resource Map at URI-R.
- Example: for the URI-R http://sample.org/ReM-1, the Aggregation has the URI-A http://sample.org/ReM-1#aggregation.
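
The relationship between URI-R and URI-A can be sketched in a few lines of Python. This is only an illustration of the fragment-identifier convention used in the example above; the function names are hypothetical, and the default fragment "aggregation" is taken from the sample URI rather than from any stated rule.

```python
from urllib.parse import urldefrag


def aggregation_uri(rem_uri: str, fragment: str = "aggregation") -> str:
    """Derive URI-A from URI-R by appending a fragment identifier."""
    base, _ = urldefrag(rem_uri)  # drop any fragment already present
    return f"{base}#{fragment}"


def resource_map_uri(uri_a: str) -> str:
    """Recover URI-R from URI-A.

    An HTTP client removes the fragment before sending a request,
    so dereferencing URI-A retrieves the resource at URI-R.
    """
    return urldefrag(uri_a).url


uri_r = "http://sample.org/ReM-1"
uri_a = aggregation_uri(uri_r)
print(uri_a)
print(resource_map_uri(uri_a))
```

Because the fragment never reaches the server, one HTTP request serves both identifiers, while the two URIs still name two distinct things: the Resource Map and the Aggregation it describes.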

ReM Graph (figure)

ReM-Aggregation (figure)

Representing Metadata (figure)

Aggregated Resources (figure)

Sharing of Aggregated Resources (figure)

Serialization Formats
The ORE Model can be implemented in many serialization formats, two of them being:
- Atom
- RDF/XML

Resource Map Implementation in Atom

Building a ReM
- arXiv e-print example: http://arxiv.org/abs/astro-ph/0601007
- Hypothetical ReM: http://arxiv.org/rem/astro-ph/0601007
- 3 Aggregated Resources:
  - http://arxiv.org/ps/astro-ph/0601007
  - http://arxiv.org/pdf/astro-ph/0601007
  - http://arxiv.org/e-print/astro-ph/0601007

Building a ReM…
    <?xml version="1.0" encoding="utf-8"?>
    <feed xmlns="http://www.w3.org/2005/Atom">
      <id>tag:arxiv.org,2007:astro-ph/0601007v2</id>
      <category scheme="http://www.openarchives.org/ore/terms/"
                term="http://www.openarchives.org/ore/terms/ResourceMap"
                label="Resource Map" />
      <link rel="describes" href="http://arxiv.org/rem/astro-ph/0601007#aggregation" />
      <entry>
        <id>tag:arxiv.org,2007:astro-ph/0601007v2:pdf</id>
      </entry>
      <entry>
        <id>tag:arxiv.org,2007:astro-ph/0601007v2:ps</id>
      </entry>
      <entry>
        <id>tag:arxiv.org,2007:astro-ph/0601007v2:e-print</id>
      </entry>
    </feed>

Building a ReM…
    <?xml version="1.0" encoding="utf-8"?>
    <feed xmlns="http://www.w3.org/2005/Atom">
      <id>tag:arxiv.org,2007:astro-ph/0601007v2</id>
      <link href="http://arxiv.org/rem/astro-ph/0601007" rel="self"
            type="application/atom+xml"/>
      <category scheme="http://www.openarchives.org/ore/terms/"
                term="http://www.openarchives.org/ore/terms/ResourceMap"
                label="Resource Map" />
      <link rel="describes" href="http://arxiv.org/rem/astro-ph/0601007#aggregation" />
      <entry>
        <id>tag:arxiv.org,2007:astro-ph/0601007v2:ps</id>
        <link href="http://arxiv.org/ps/astro-ph/0601007" rel="alternate"
              type="application/postscript"/>
      </entry>
      ….
    </feed>
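
The feed above could also be generated programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree, not a reference implementation of the ORE Atom profile. The helper names (atom, build_rem) are hypothetical, and the media type application/pdf for the PDF resource is an assumption; the slides give a type only for the PostScript entry, so the e-print entry is left without one.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ORE_TERMS = "http://www.openarchives.org/ore/terms/"


def atom(tag):
    """Return the namespace-qualified name of an Atom element."""
    return f"{{{ATOM_NS}}}{tag}"


def build_rem(rem_uri, feed_id, resources):
    """Build an Atom feed element that serializes a Resource Map.

    resources is a list of (entry_id, href, media_type) tuples;
    media_type may be None when it is not known.
    """
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("id")).text = feed_id
    # The feed's own location is URI-R.
    ET.SubElement(feed, atom("link"), href=rem_uri, rel="self",
                  type="application/atom+xml")
    # Marks the feed as a Resource Map.
    ET.SubElement(feed, atom("category"), scheme=ORE_TERMS,
                  term=ORE_TERMS + "ResourceMap", label="Resource Map")
    # Points at the Aggregation (URI-A) that this ReM describes.
    ET.SubElement(feed, atom("link"), rel="describes",
                  href=rem_uri + "#aggregation")
    # One entry per Aggregated Resource.
    for entry_id, href, media_type in resources:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("id")).text = entry_id
        attrs = {"href": href, "rel": "alternate"}
        if media_type:
            attrs["type"] = media_type
        ET.SubElement(entry, atom("link"), attrs)
    return feed


base_id = "tag:arxiv.org,2007:astro-ph/0601007v2"
rem = build_rem(
    "http://arxiv.org/rem/astro-ph/0601007",
    base_id,
    [
        (base_id + ":pdf", "http://arxiv.org/pdf/astro-ph/0601007",
         "application/pdf"),  # assumed media type
        (base_id + ":ps", "http://arxiv.org/ps/astro-ph/0601007",
         "application/postscript"),
        (base_id + ":e-print", "http://arxiv.org/e-print/astro-ph/0601007",
         None),  # type not given in the slides
    ],
)
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace
print(ET.tostring(rem, encoding="unicode"))
```

The sketch covers only the elements shown so far; a complete feed would also need title, author, and updated elements.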

Building a ReM…
More elements:
- /feed/title
- /feed/entry/title
- /feed/author
http://www.openarchives.org/ore/0.2/atom-implementation#title

Building a ReM…
- /feed/updated and /feed/entry/updated
http://www.openarchives.org/ore/0.2/atom-implementation#updated

Common Scenarios for Aggregations
- Multiple Formats
- Mirror Copies
- Versions
- Splash Pages

Resource Map Discovery

Resource Map Discovery
- Crawlers or harvesters must discover Resource Maps (ReMs).
- Discovery mechanisms:
  (A) Batch Discovery
  (B) Resource Embedding
  (C) Response Embedding

(A) Batch Discovery
- With this method, agents can discover ReMs in groups.
- ReMs are represented in many formats, and batch discovery can be applied to most of them:
  i) ReMs in OAI-PMH
  ii) ReMs in SiteMaps
  iii) ReMs in Syndication Feeds

i) ReMs in OAI-PMH
OAI-PMH Request:
    http://www.foo.edu/oai?verb=GetRecord&identifier=oai:foo.edu:object1&metadataPrefix=oai_rem
Response:
    <?xml version="1.0" encoding="UTF-8"?>
    <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
             http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
      <responseDate>2007-02-08T08:55:46Z</responseDate>
      <request verb="GetRecord" identifier="oai:foo.edu:object1"
               metadataPrefix="oai_rem">http://foo.edu/oai2</request>
      <GetRecord>
        <record>
          <header>
            <identifier>oai:foo.edu:object1</identifier>
            <datestamp>2007-01-06</datestamp>
          </header>
          <metadata>
            <!-- Insert ReM here -->
          </metadata>
        </record>
      </GetRecord>
    </OAI-PMH>
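
A harvester's side of this exchange might look roughly like the sketch below, which builds the GetRecord request URL and pulls the record header and metadata container out of a response. It uses only the Python standard library and parses an inline sample rather than calling a live repository. The function names are hypothetical, and the metadataPrefix oai_rem is taken from the slide.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def get_record_url(base_url, identifier, metadata_prefix="oai_rem"):
    """Build an OAI-PMH GetRecord request URL."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        },
        safe=":",  # keep the colons of the OAI identifier readable
    )
    return f"{base_url}?{query}"


def extract_record(response_xml):
    """Return (identifier, datestamp, metadata element) from a GetRecord response."""
    root = ET.fromstring(response_xml)
    record = root.find("oai:GetRecord/oai:record", OAI_NS)
    identifier = record.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
    datestamp = record.findtext("oai:header/oai:datestamp", namespaces=OAI_NS)
    metadata = record.find("oai:metadata", OAI_NS)  # the ReM would sit inside
    return identifier, datestamp, metadata


# Trimmed copy of the sample response, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2007-02-08T08:55:46Z</responseDate>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:foo.edu:object1</identifier>
        <datestamp>2007-01-06</datestamp>
      </header>
      <metadata><!-- Insert ReM here --></metadata>
    </record>
  </GetRecord>
</OAI-PMH>
"""

print(get_record_url("http://www.foo.edu/oai", "oai:foo.edu:object1"))
identifier, datestamp, metadata = extract_record(SAMPLE_RESPONSE)
print(identifier, datestamp, len(metadata))
```

In a real harvest, the children of the metadata element would be the serialized ReM; here the element is empty because the sample carries only a placeholder comment.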

ii) ReMs in SiteMaps
http://www.foo.edu/sitemap-rem.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.foo.edu/objects/object1.atom</loc>
        <lastmod>2007-01-06</lastmod>
      </url>
      <url>
        <loc>http://www.foo.edu/objects/object2.atom</loc>
        <lastmod>2007-08-11</lastmod>
        <changefreq>weekly</changefreq>
      </url>
      <url>
        <loc>http://www.foo.edu/objects/object3.atom</loc>
        <lastmod>2007-03-15T18:30:02Z</lastmod>
        <priority>0.3</priority>
      </url>
      ...
    </urlset>
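
Reading ReM locations out of such a sitemap is straightforward. The sketch below uses Python's standard xml.etree.ElementTree on an inline copy of the first two entries; the function name rem_locations is hypothetical, and fetching the sitemap over HTTP is left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def rem_locations(sitemap_xml):
    """Return a list of (loc, lastmod) pairs, one per <url> entry."""
    root = ET.fromstring(sitemap_xml)
    results = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)  # may be None
        results.append((loc, lastmod))
    return results


SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.foo.edu/objects/object1.atom</loc>
    <lastmod>2007-01-06</lastmod>
  </url>
  <url>
    <loc>http://www.foo.edu/objects/object2.atom</loc>
    <lastmod>2007-08-11</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
"""

for loc, lastmod in rem_locations(SAMPLE_SITEMAP):
    print(loc, lastmod)
```

A harvester could compare lastmod against the time of its previous visit to decide which ReMs to fetch again.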

iii) ReMs in Syndication Feeds (Atom)
http://www.foo.edu/all-rems.atom
    <?xml version="1.0" encoding="utf-8"?>
    <feed xmlns="http://www.w3.org/2005/Atom">
      <title>ReMs at www.foo.edu</title>
      <link href="http://www.foo.edu/" />
      <link href="http://www.foo.edu/all-rems.atom" rel="self"/>
      <updated>2007-08-15T18:30:02Z</updated>
      <author>
        <name>John Doe</name>
        <email>johndoe@foo.edu</email>
      </author>
      <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
      <entry>
        <title>ReM For Object1</title>
        <link href="http://www.foo.org/objects/object1.atom"/>
        <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
        <updated>2007-01-06T00:00:00Z</updated>
      </entry>
      <entry>
        <title>ReM For Object2</title>
        <link href="http://www.foo.org/objects/object2.atom"/>
        <id>urn:uuid:9a2cc699-ccba-9e8b-132e-91da394e9a5c</id>
        <updated>2007-08-11T00:00:00Z</updated>
      </entry>
    </feed>

iii) ReMs in Syndication Feeds (RSS 2.0)
http://www.foo.edu/all-rems.rss
    <?xml version="1.0"?>
    <rss version="2.0">
      <channel>
        <title>ReMs at www.foo.edu</title>
        <link>http://www.foo.edu/</link>
        <description>All of the Resource Maps for resources at www.foo.edu</description>
        <item>
          <title>ReM for Object 1</title>
          <link>http://www.foo.org/objects/object1.atom</link>
          <description>ReM for Object 1</description>
          <pubDate>Sat, 06 Jan 2007 00:00:00 GMT</pubDate>
        </item>
        <item>
          <title>ReM for Object 2</title>
          <link>http://www.foo.org/objects/object2.atom</link>
          <description>ReM for Object 2</description>
          <pubDate>Sat, 11 Aug 2007 00:00:00 GMT</pubDate>
        </item>
      </channel>
    </rss>
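
An agent that accepts either feed format could collect the ReM links as sketched below. This is a simplified illustration with Python's standard library: it looks only at the root element to tell Atom from RSS 2.0, and the function name rem_links and the shortened sample feeds are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def rem_links(feed_xml):
    """Return the ReM URIs listed in an Atom or RSS 2.0 discovery feed."""
    root = ET.fromstring(feed_xml)
    if root.tag == f"{{{ATOM_NS}}}feed":
        # Atom: each entry carries the ReM location in <link href="...">.
        ns = {"a": ATOM_NS}
        return [link.get("href") for link in root.findall("a:entry/a:link", ns)]
    if root.tag == "rss":
        # RSS 2.0: each item carries the ReM location as the text of <link>.
        return [item.findtext("link") for item in root.findall("channel/item")]
    raise ValueError(f"unsupported feed root element: {root.tag}")


ATOM_SAMPLE = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>ReMs at www.foo.edu</title>
  <link href="http://www.foo.edu/all-rems.atom" rel="self"/>
  <entry>
    <title>ReM For Object1</title>
    <link href="http://www.foo.org/objects/object1.atom"/>
  </entry>
</feed>
"""

RSS_SAMPLE = """
<rss version="2.0">
  <channel>
    <title>ReMs at www.foo.edu</title>
    <link>http://www.foo.edu/</link>
    <item>
      <title>ReM for Object 2</title>
      <link>http://www.foo.org/objects/object2.atom</link>
    </item>
  </channel>
</rss>
"""

print(rem_links(ATOM_SAMPLE))
print(rem_links(RSS_SAMPLE))
```

Feed-level links, such as the Atom rel="self" link or the RSS channel link, are deliberately skipped, since they identify the feed or site rather than a ReM.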

(B) Resource Embedding
- The link to the ReM is embedded in the Web page.
- This can be done by:
  i) using the HTML Link Element
  ii) showing ReMs in HTML pages

Knowledge Levels
- Full knowledge: the ReM is linked to by all resources in the aggregation.
- Indirect knowledge: all but one of the resources in the aggregation link to a single, unique resource in the aggregation, which in turn links to the ReM. This is functionally the same as full knowledge, but likely to be useful in actual deployment.
- Limited knowledge: only a subset of the resources in the aggregation (typically just a single resource) link to the ReM, and the remainder of the resources have no links at all.
- Zero knowledge: none of the resources in the aggregation link to a ReM.

i) HTML Link Element (Full Knowledge)
    <html>
    <head>
      <title>Hello World.</title>
      <link href="http://example.net/hw.atom"
            type="application/atom+xml"
            rel="resourcemap" >
    </head>
    <body>
      <img src="hello.jpeg">
      <img src="world.jpeg">
    </body>
    </html>

i) HTML Link Element (Indirect Knowledge)
    <html>
    <head>
      <title>Chapter Twelve.</title>
      <link href="http://mybook.com/toc.html"
            type="text/html"
            rel="indirectresourcemap" >
    </head>
    <body>
      Welcome to chapter twelve...
    </body>
    </html>
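
A crawler could pick up these link elements with Python's standard html.parser, as in the sketch below. The class name ReMLinkFinder is hypothetical, and the two rel values are the ones used on these slides; the sketch matches them literally and does not handle a rel attribute that lists several space-separated values.

```python
from html.parser import HTMLParser

# rel values shown on the slides for full and indirect knowledge
REM_RELS = {"resourcemap", "indirectresourcemap"}


class ReMLinkFinder(HTMLParser):
    """Collect (rel, href, type) for <link> elements that point toward a ReM."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if rel in REM_RELS:
            self.links.append((rel, attributes.get("href"), attributes.get("type")))


PAGE = """
<html>
<head>
  <title>Hello World.</title>
  <link href="http://example.net/hw.atom"
        type="application/atom+xml"
        rel="resourcemap" >
</head>
<body><img src="hello.jpeg"><img src="world.jpeg"></body>
</html>
"""

finder = ReMLinkFinder()
finder.feed(PAGE)
print(finder.links)
```

For a "resourcemap" link the href is the ReM itself. For an "indirectresourcemap" link the agent would fetch the linked page and run the same search there.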

ii) Showing ReMs in HTML pages
(figure: link to the ReM)

(C) Response Embedding
The HTTP response carries the link to the ReM.
(request):
    HEAD http://www.example.net/hello.jpeg HTTP/1.1
    Host: www.example.net
    Connection: close
(response):
    HTTP/1.1 200 OK
    Date: Sat, 26 May 2007 22:43:10 GMT
    Server: Apache/2.2.0
    Last-Modified: Sat, 26 May 2007 19:32:04 GMT
    ETag: "c3596-816-92123500"
    Accept-Ranges: bytes
    Content-Length: 2070
    Link: <http://example.net/hw.atom>; type="application/atom+xml"; rel="resourcemap"
    Content-Type: image/jpeg
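
On the client side, the Link header has to be split into its target URI and parameters. The sketch below does this with two regular expressions. It is a simplified parser written for this example, with hypothetical function names, and it does not cover every form the Link header syntax allows (for instance, quoted strings with escaped quotes).

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)'
)
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return a list of (target, params) pairs from a Link header value."""
    links = []
    for match in LINK_RE.finditer(value):
        target, raw_params = match.group(1), match.group(2)
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted or bare
        links.append((target, params))
    return links


def resource_maps(value):
    """Return the targets whose rel parameter is "resourcemap"."""
    return [
        target
        for target, params in parse_link_header(value)
        if params.get("rel", "").lower() == "resourcemap"
    ]


header = '<http://example.net/hw.atom>; type="application/atom+xml"; rel="resourcemap"'
print(parse_link_header(header))
print(resource_maps(header))
```

Because the request is a HEAD, the agent learns the ReM location without downloading the image itself.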

References
- http://www.openarchives.org/ore/meetings/hopkins/agenda.htm
- http://www.openarchives.org/ore/0.2/toc