Role of RSS in Science Publishing (NPG) and Delivering OAI Records as RSS
By Aparna R. Belhe and Archana Galipalli
Why RSS in Science Publishing?
- RSS is used for news, weather updates, entertainment sites, Craigslist, blogs, universities
- So why not in science publishing?
- RSS can be used for online publications, web debates, scientific job postings, latest journal postings
RSS vs Traditional Web
- Traditional Web – shared collaboratory; RSS – flat, read-only kaleidoscope
- RSS is the antithesis of the traditional web
- RSS – synopsis, snapshot, signal of change of a website
CS791- Web Syndication Formats
Data Syndication using RSS
- All RSS standards are specifications for XML documents (titles, links and descriptions)
- Aggregators pull down documents available on web servers
- Standard HTTP protocol required
- RSS readers/aggregators alert users when new content is detected
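The pull model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feed content, the example.org URLs and the function names parse_items and new_items are hypothetical, and the feed is an inline RSS 2.0 string rather than a document fetched over HTTP (a real aggregator could fetch it with urllib.request.urlopen).

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document, standing in for what a web server would return.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal</title>
    <link>http://example.org/</link>
    <description>Latest articles</description>
    <item>
      <title>First article</title>
      <link>http://example.org/articles/1</link>
      <description>Summary of the first article.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.org/articles/2</link>
      <description>Summary of the second article.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return (title, link, description) tuples for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        )
        for item in root.iter("item")
    ]


def new_items(feed_xml, seen_links):
    """Return items whose link was not seen before, and remember them."""
    fresh = [it for it in parse_items(feed_xml) if it[1] not in seen_links]
    seen_links.update(it[1] for it in fresh)
    return fresh


# Pretend the first article was already shown to the user on an earlier poll.
seen = {"http://example.org/articles/1"}
for title, link, _ in new_items(SAMPLE_FEED, seen):
    print(f"NEW: {title} <{link}>")
```

Tracking items by link is one simple way to detect change; it assumes each item has a stable, unique link.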
RSS Lineage
[Figure: RSS lineage]
- Really Simple Syndication – syndicates ephemeral content like news headlines and blog entries
- RDF Site Summary – means of exchanging structured metadata; provides a simple modular extension mechanism
Metadata Annotation using RSS
- Because of its RDF pedigree, RSS 1.0 is suited to metadata inclusion
- Vocabularies to include metadata elements:
- Dublin Core
  - Provides basic interoperability with a 15-element data set
  - No level of granularity
  - Discrete bibliographic elements cannot be codified
- PRISM (Publishing Requirements for Industry Standard Metadata)
  - Describes content assets from trade serials publications for syndication
PRISM (ctd.)
- Defines a small set of vocabularies
- Addresses industry requirements for resource discovery
- Defines an additional basic vocabulary (50 terms) alongside Dublin Core
- Specialized vocabularies dealing with rights, inline markup, and controlled vocabularies
- Our interest: bibliographic metadata fields such as issn, volume, number, startingPage, etc.
Metadata Annotation using RSS: Extension Modules
- Provides guidelines for usage of the vocabulary
- Two new PRISM elements for science publishers: endingPage, eIssn
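To make the annotation idea concrete, here is a minimal sketch of an RSS 1.0 item carrying Dublin Core and PRISM elements, built with Python's standard library. All values are placeholders. The PRISM namespace URI (version 1.2) is an assumption, as is placing endingPage and eIssn in that same namespace; the helper names q and build_item are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
    # Assumption: PRISM 1.2 basic namespace; the slides do not name a version.
    "prism": "http://prismstandard.org/namespaces/1.2/basic/",
}
# Register prefixes so the serialized XML uses dc:, prism:, rdf: and a
# default namespace for the RSS 1.0 elements.
for prefix, uri in NS.items():
    ET.register_namespace("" if prefix == "rss" else prefix, uri)


def q(prefix, local):
    """Build a namespace-qualified tag name in ElementTree's {uri}local form."""
    return f"{{{NS[prefix]}}}{local}"


def build_item(about, title, dc_fields, prism_fields):
    """Create an RSS 1.0 item with Dublin Core and PRISM child elements."""
    item = ET.Element(q("rss", "item"), {q("rdf", "about"): about})
    ET.SubElement(item, q("rss", "title")).text = title
    ET.SubElement(item, q("rss", "link")).text = about
    for name, value in dc_fields.items():
        ET.SubElement(item, q("dc", name)).text = value
    for name, value in prism_fields.items():
        ET.SubElement(item, q("prism", name)).text = value
    return item


# Placeholder bibliographic values, for illustration only.
item = build_item(
    "http://example.org/articles/42",
    "A hypothetical article",
    {"creator": "A. Author", "date": "2004-12-01"},
    {
        "issn": "0000-0000",
        "eIssn": "0000-0001",
        "volume": "12",
        "number": "3",
        "startingPage": "101",
        "endingPage": "110",
    },
)
print(ET.tostring(item, encoding="unicode"))
```

Keeping the bibliographic fields as separate elements is what lets a downstream consumer read volume, issue and page range without parsing free text.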
Metadata Annotation using RSS: Extension
- Defines a small set of terms for job advertisements
- Vocabulary consists of offeredBy, city, country, postedOn, expiresOn, etc.
Example: comparison between a job posting on NPG and Craigslist
[Figures: NPG job posting; Craigslist job posting]
More Modules
- ContextObject enables contextual information to travel along with the body of an RSS feed
- A downstream application using this feed can provide the consumer of the feed with context-sensitive services
Example of Context Object – Flattened Form
Example of Context Object – Hierarchical Form
What are publishers using RSS for?
- Alerting service for table-of-contents information
- Sending out notifications of new issues
- News services: jobs, product data, events
- Maintenance of archives of RSS feeds
- NPG – RSS feeds
Need for Aggregator
- Information providers must write custom code to publish RSS feeds
- Hard for non-programmers to merge and filter RSS feeds
- Harder still for publishers to set up RSS-based news aggregation services for particular areas of interest
- Develop an open-source, web-based RSS aggregator and filter
Urchin – An RSS Aggregator
- Developed using open-source software components: Linux, Apache, MySQL and Perl
- Successfully ported to run on other Unixes (Mac OS X, as well as the Windows Unix emulation layer Cygwin) and on Windows
- No specific web server required
Urchin – Basic Functionality
- Ingest information from data sources (all RSS flavors, Atom, HTML pages, databases)
- Store this information internally
- Emit, on request, a filtered information set in the selected information format
Overview of Urchin Functionality
NPG uses Urchin for:
- Provision of keyword-filtered RSS feeds for its staff
- Population of a new technology and publishing news portal
- Picking out stories mentioning a certain word from among its sources
- E.g. a particular value for dc:creator, dc:date, dc:subject
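The ingest, store and emit cycle, together with keyword and metadata filtering, can be sketched as a small in-memory class. This is not Urchin's code (Urchin is written in Perl and stores data in MySQL); the names Item and TinyAggregator and all sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    link: str
    description: str = ""
    metadata: dict = field(default_factory=dict)  # e.g. {"dc:subject": "genomics"}


class TinyAggregator:
    """Toy ingest/store/emit pipeline, held entirely in memory."""

    def __init__(self):
        # Keyed by link, so re-ingesting the same feed does not duplicate items.
        self._items = {}

    def ingest(self, items):
        for item in items:
            self._items[item.link] = item

    def emit(self, keyword=None, where=None):
        """Return stored items matching a keyword and/or exact metadata values."""
        where = where or {}
        results = []
        for item in self._items.values():
            text = f"{item.title} {item.description}".lower()
            if keyword and keyword.lower() not in text:
                continue
            if any(item.metadata.get(k) != v for k, v in where.items()):
                continue
            results.append(item)
        return results


agg = TinyAggregator()
agg.ingest([
    Item("Genome of a model organism", "http://example.org/a/1",
         "Sequencing results.", {"dc:subject": "genomics", "dc:creator": "A. Author"}),
    Item("New telescope images", "http://example.org/a/2",
         "Survey data.", {"dc:subject": "astronomy", "dc:creator": "B. Author"}),
])

for item in agg.emit(keyword="genome"):
    print("keyword match:", item.title)
for item in agg.emit(where={"dc:subject": "astronomy"}):
    print("metadata match:", item.title)
```

A real aggregator would also serialize the filtered set back into the requested feed format; that step is left out here to keep the sketch short.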
Concluding Urchin
- The RSS aggregator Urchin has proved to be a useful building block for new applications
- Not just an academic research tool: it has already been pressed into commercial use
- Urchin is open-source code, deposited at SourceForge, and freely available for others to use
Delivering OAI Records as RSS
An IMesh Toolkit Module for Facilitating Resource Sharing
OAI-PMH
- Open Archives Initiative Protocol for Metadata Harvesting (the OAI protocol)
- Low-barrier interoperability framework
- Service providers and data providers
- Client-server architecture
- Uses XML over HTTP
- Commercial search engines have started using OAI-PMH to acquire more resources
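As a rough illustration of "XML over HTTP", the sketch below builds a ListRecords request URL and reads titles out of a trimmed response. The base URL, the sample record and the function names are hypothetical. The response omits elements a real repository returns (such as responseDate and request), and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a harvester would request with HTTP GET."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Trimmed, hypothetical response from a data provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2004-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example resource</dc:title>
          <dc:identifier>http://example.org/resource/1</dc:identifier>
          <dc:description>A short description.</dc:description>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def record_titles(response_xml):
    """Return the dc:title of every record in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [
        record.findtext("oai:metadata/oai_dc:dc/dc:title", default="", namespaces=OAI_NS)
        for record in root.iterfind(".//oai:record", OAI_NS)
    ]


print(list_records_url("http://example.org/oai"))
print(record_titles(SAMPLE_RESPONSE))
```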
The IMesh Toolkit Project
- IMesh Toolkit project announced at the Warwick workshop
- UKOLN, University of Bath
- ILRT, University of Bristol
- Internet Scout Project, University of Wisconsin, Madison
Aims
- To develop a framework for subject gateways which supports the reuse of tools and metadata
- To provide a framework for interoperability between subject gateways, and between gateways and other services
Motivation
- Interest in embedding subject gateway content into other websites
- RDN-Include ~ Google Toolbar
- The IMesh Toolkit Project
The correspondence between record metadata and an RSS item
[Figure]
The IMesh Toolkit Module
[Figure]
The IMesh Toolkit: Reading List Module
[Figure]
The IMesh Toolkit Module (ctd.)
- Written in Perl
- Outputs an RSS file which contains a list of reading list materials
- Each individual service should decide the design of the interface
- Module is independent of the technologies
- The RSS file can be:
  - edited using an RSS editor
  - presented to users
  - displayed in Web pages
Interoperability Issues
- The oai_dc schema mandated by the Open Archives Initiative Protocol for Metadata Harvesting contains three fields (dc:title, dc:identifier, dc:description) that can be reused in the RSS item
- Difficulties in defining the URL of the resource that the record describes:
  - Multiple IDs may be given
  - dc:identifier may be a link to the OAI repository identifier
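The mapping and the identifier problem can be illustrated with a short Python sketch. It is not the IMesh Toolkit module (which is written in Perl). The sample record, the function names and the rule for choosing a link are assumptions: the heuristic simply prefers the first dc:identifier that looks like a web URL.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical oai_dc record with two identifiers, only one of which is a URL.
SAMPLE_DC = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example resource</dc:title>
  <dc:identifier>oai:example.org:1</dc:identifier>
  <dc:identifier>http://example.org/resource/1</dc:identifier>
  <dc:description>A short description.</dc:description>
</oai_dc:dc>"""


def pick_link(identifiers):
    """Assumed heuristic: prefer the first identifier that looks like a web URL."""
    for ident in identifiers:
        if ident.startswith(("http://", "https://")):
            return ident
    # Fall back to the first identifier, or an empty link if there is none.
    return identifiers[0] if identifiers else ""


def record_to_rss_item(dc_xml):
    """Map dc:title, dc:identifier and dc:description onto an RSS item."""
    record = ET.fromstring(dc_xml)
    identifiers = [e.text.strip() for e in record.findall(f"{DC}identifier") if e.text]
    item = ET.Element("item")
    ET.SubElement(item, "title").text = record.findtext(f"{DC}title", default="")
    ET.SubElement(item, "link").text = pick_link(identifiers)
    ET.SubElement(item, "description").text = record.findtext(f"{DC}description", default="")
    return item


rss_item = record_to_rss_item(SAMPLE_DC)
print(ET.tostring(rss_item, encoding="unicode"))
```

The heuristic fails when every identifier is a URL but only one points at the resource itself, which is the difficulty this slide describes.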
Guidelines for Repository Implementers
- Minimal Repository Implementation
- Dublin Core and Other Metadata Formats
- Containers
- Sets
- Response Compression
- Flow Control
Conclusion
- Sharing of subject gateway content is achieved by the IMesh Toolkit module
- These interfaces and standards provide the basis for interoperation
- OAI-PMH to RSS mapping can be done
- adinglists/dlesetrial.html
References
- 2hammond.html
- adinglists/module/