An XML Web Publishing Framework From the Apache Project


Cocoon: An XML Web Publishing Framework From the Apache Project
Roland Schweitzer

Today’s Topics:
Definitions
Motivation
Required Tools (Java, Apache Tomcat and Cocoon)
Basic Cocoon Operation
Matchers, Generators, Transforms and Serializers. Oh My!
sitemap.xmap glues it all together.
6 August 2002 OAR Web Shop

Cocoon
An XML-based WWW publishing framework implemented as a Java Servlet.
Web site content stored in XML files (or in an RDBMS, LDAP server or other source) is transformed (mostly via XSLT) into new XML documents (to exclude certain info, for example) and then serialized into human-usable output (like an HTML or PDF file).

Reusable Content

Motivation for using Cocoon
We distribute climate data.
Users (including scientists) find data via public search engines like Google.
Public search engines index HTML content.
NOAA and other scientific organizations use special-purpose search engines that use FGDC (or DIF derived from FGDC).

Motivation continued
These facts add up to maintaining separate “documents” for each purpose.
XML and Cocoon offer (yet another) potential way out of the morass of many special-purpose document collections.

Suppose info was stored as XML
<page>
  <title>Reynolds Sea Surface Temperature</title>
  <prefix>data.sst</prefix>
  <abstract>
    <para>The optimum interpolation (OI) SST analysis…</para>
  </abstract>
  <contact>
    <name>CDC Data Management Personnel</name>
    <address1>325 Broadway</address1>
    <phone>(303) 497-6244</phone>
    <email>cdcdata@cdc.noaa.gov</email>
  </contact>
  …
</page>

The Power of XML Content
Can be parsed with standard XML tools.
Can easily be used for purposes other than the Web.
Can be written with powerful XML GUI tools (e.g. XMLSpy).
(Might be) easier to maintain.
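
As a rough illustration of "parsed with standard XML tools", the sketch below reads a shortened copy of the <page> record with Python's standard library. The function name `summarize`, the choice of fields and the shortened sample text are assumptions made for this example; they are not part of the talk or of Cocoon.

```python
import xml.etree.ElementTree as ET

# Sample record modeled on the <page> example from the slides;
# the abstract and contact are shortened here for illustration.
PAGE_XML = """
<page>
  <title>Reynolds Sea Surface Temperature</title>
  <prefix>data.sst</prefix>
  <abstract>
    <para>The optimum interpolation (OI) SST analysis</para>
  </abstract>
  <contact>
    <email>cdcdata@cdc.noaa.gov</email>
  </contact>
</page>
"""

def summarize(xml_text):
    """Return a few fields from a <page> document as a plain dict."""
    page = ET.fromstring(xml_text)
    return {
        "title": page.findtext("title").strip(),
        "prefix": page.findtext("prefix").strip(),
        "email": page.findtext("contact/email").strip(),
    }

summary = summarize(PAGE_XML)
print(summary["title"], summary["email"])
```

The same record could feed a search-engine metadata export or a Web page without being rewritten, which is the reuse the slide argues for.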

Reusable Content

Schematic of the Solution Using Cocoon
[Diagram labels: Cocoon; Some other process]

Required Tools
On Solaris 7 and 8 I have used the binary distributions of:
Java 1.4.0 (java.sun.com)
Tomcat 4.0.4 (www.apache.org)
Cocoon 2.0.3 (xml.apache.org)
At this time, these are the latest releases.
Follow the installation instructions for each package.

Basic Operation
Cocoon is based on pipelines:
[Diagram labels: A Bit of Software; XML File; New XML File; A Bit of Software; New XML File; Info to client (e.g. HTML to browser)]

Basic Operation
Cocoon is based on pipelines.
An XML document is pushed through a pipeline that consists of one Generator (read a file, create a document from an LDAP server, etc.), zero or more Transforms (for example, to leave out sensitive information for external users) and a final Serializer that converts the XML to binary or character data for consumption by the client (Web browser).
The entire site could use only one pipeline.
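
The generator, transform, serializer sequence can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: Cocoon itself is written in Java and passes SAX events between components, while this sketch passes an in-memory tree, and the names `generate`, `drop_contact`, `serialize` and `run_pipeline` are invented for the example.

```python
import xml.etree.ElementTree as ET

def generate(xml_text):
    """Generator stage: produce an XML tree (here from a string, not a file)."""
    return ET.fromstring(xml_text)

def drop_contact(tree):
    """Transform stage: remove <contact> children, e.g. for external users."""
    for contact in tree.findall("contact"):
        tree.remove(contact)
    return tree

def serialize(tree):
    """Serializer stage: turn the tree into character data for the client."""
    return ET.tostring(tree, encoding="unicode")

def run_pipeline(xml_text, transforms=()):
    """One generator, zero or more transforms, one serializer."""
    tree = generate(xml_text)
    for transform in transforms:
        tree = transform(tree)
    return serialize(tree)

SOURCE = "<page><title>SST</title><contact><phone>555</phone></contact></page>"
print(run_pipeline(SOURCE, [drop_contact]))
```

Calling `run_pipeline(SOURCE)` with no transforms passes the document straight from generator to serializer, which corresponds to the "zero or more Transforms" case.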

Basic Operation
If you need more than one pipeline…
Matchers (wildcard and regular expression) and Selectors (Boolean expressions) can be used to control the pipeline used to process the XML content.

Components
Matchers, Generators, Transforms and Serializers are all Cocoon Components.
Pipelines are built out of Components.
Components are declared and pipelines are constructed in the sitemap.xmap file.
The “Bit of Software” needed for each Component is provided by Cocoon or built by you.

Components (Matchers)
Suppose you wanted these URI patterns to be handled by Cocoon. For example, the wildcard patterns
http://www.cdc.noaa.gov/cocoon/data/*.html
and
http://www.cdc.noaa.gov/cocoon/data/*.pdf
could result in two pipelines with two different output types.

Components (Matchers)
Need a “bit of software” that:
Looks at http://www.cdc.noaa.gov/cocoon/data/data.sst.html
Matches the URL www.cdc.noaa.gov/cocoon/data and the extension “.html”
Extracts the wildcard part of the URL: data.sst
Starts the pipeline to produce HTML output from the data.sst.xml file (the wildcard plus the .xml extension).
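
The matching steps listed above can be approximated with a small Python sketch. It is not Cocoon's matcher: the helper names are invented, only a single `*` is handled, and the rule that `*` stops at a `/` is an assumption about the wildcard semantics.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a pattern with '*' into a regex; each '*' captures text without '/'."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "([^/]*)".join(parts) + "$")

def match_uri(pattern, path):
    """Return the list of wildcard captures, or None if the path does not match."""
    found = wildcard_to_regex(pattern).match(path)
    return list(found.groups()) if found else None

# Path relative to the Cocoon mount point, as in the sitemap patterns.
captures = match_uri("data/*.html", "data/data.sst.html")
print(captures)
```

A path ending in `.pdf` returns None for this pattern, so a second pattern would be needed to route it to a different pipeline.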

The WildCard Matcher
We’re in luck! A Matcher Component already exists in Cocoon to do what we want.
To use a Component we must declare it in the sitemap.xmap file that controls our Cocoon installation.

Declare the WildCard Matcher
In the sitemap.xmap configuration file:
<map:matchers default="wildcard">
  <map:matcher name="wildcard" src="org.apache.cocoon.matching.WildcardURIMatcher"/>
  …
</map:matchers>

Use the Matcher on a URI
We’ve declared the Matcher Component.
Use the Matcher Component in our pipeline to grab the * part of the pattern and use it to specify the source XML file that will be sent through the pipeline.

Use the Matcher in a Pipeline
This pipeline uses the default Matcher, which is the WildCard Matcher we declared in the previous slide.
<map:match pattern="data/*.html">
  <map:generate src="data/{1}.xml"/>
  …
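
The `{1}` in the src attribute stands for the text matched by the first `*`. A minimal Python sketch of that substitution follows; the function name `expand_source` is hypothetical, and the sketch only covers numbered placeholders of the form `{1}`, `{2}`, and so on.

```python
import re

def expand_source(template, captures):
    """Replace {1}, {2}, ... in template with the matching 1-based capture."""
    def lookup(found):
        return captures[int(found.group(1)) - 1]
    return re.sub(r"\{(\d+)\}", lookup, template)

print(expand_source("data/{1}.xml", ["data.sst"]))
```

With the capture `data.sst`, the generator source becomes the data.sst.xml file under data/, as described on the Matcher slides.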

Now What?
We have successfully declared and used a Matcher to decide which pipeline we will use to process the first of our two example URIs.
Now we need to declare and use a Generator, which is always the first step of the pipeline.

Components (Generators)
Declare a generator in sitemap.xmap:
<map:generators default="file">
  <map:generator name="file" src="org.apache.cocoon.generation.FileGenerator"/>
  …
</map:generators>

Use the Generator in a Pipeline
The File Generator was declared as the default. Its only job is to read a file from the file system.
<map:pipelines>
  <map:pipeline>
    <map:match pattern="data/*.html">
      <map:generate src="data/{1}.xml"/>
      …

Review: Matcher and Generator
Components (Matchers)
Need a “bit of software” that:
Looks at http://www.cdc.noaa.gov/cocoon/data/data.sst.html
Matches the URL www.cdc.noaa.gov/cocoon/data and the extension “.html”
Extracts the wildcard part of the URL: data.sst
Starts the pipeline to produce HTML output from the data.sst.xml file (the wildcard plus the .xml extension).

Review: Pipeline Components
Conditional use of pipeline via the Matcher
One Generator (FileGenerator)
Zero or more Transforms (?)
Ends with a Serializer (?)

Components (Transforms)
Declare a Transform:
<map:transformers default="xslt">
  <map:transformer name="xslt" src="org.apache.cocoon.transformation.TraxTransformer">
    <use-request-parameters>false</use-request-parameters>
    <use-browser-capabilities-db> </use-browser-capabilities-db>
  </map:transformer>
  …
</map:transformers>

The XSLT Transformer
Different from the previous declarations we’ve seen: this declaration includes two additional configuration parameters.
<use-request-parameters>
<use-browser-capabilities-db>

Add the Transformer to Pipeline
<map:match pattern="*.html">
  <map:generate src="{1}.xml"/>
  <map:transform src="datastyle/HTMLstyle.xsl"/>
  …

The Stylesheet written in XSLT:
<HTML>
  <HEAD>
    <TITLE><xsl:value-of select="/page/title"/></TITLE>
  </HEAD>
  <BODY>
  …
<xsl:template match="/page/abstract">
  <h2>Abstract:</h2>
  <xsl:apply-templates select="para"/>
</xsl:template>
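
To make the effect of the stylesheet concrete, here is a Python stand-in that produces a similar HTML tree. It is not XSLT and not what Cocoon runs: Python's standard library has no XSLT processor, so the mapping is written by hand. The function name `page_to_html` is invented, and rendering each <para> as a <p> element is an assumption, since the slide does not show the para template.

```python
import xml.etree.ElementTree as ET

def page_to_html(page):
    """Build an HTML tree from a <page> element: title, then abstract paragraphs."""
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = page.findtext("title", default="")
    body = ET.SubElement(html, "body")
    abstract = page.find("abstract")
    if abstract is not None:
        # Mirrors the /page/abstract template: a heading, then the paragraphs.
        ET.SubElement(body, "h2").text = "Abstract:"
        for para in abstract.findall("para"):
            ET.SubElement(body, "p").text = para.text  # <p> is an assumption
    return html

page = ET.fromstring(
    "<page><title>Reynolds SST</title>"
    "<abstract><para>OI analysis.</para></abstract></page>"
)
print(ET.tostring(page_to_html(page), encoding="unicode", method="html"))
```

The declarative XSLT version keeps this mapping out of program code, which is why the stylesheet can be swapped per output format in the sitemap.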

Components (Serializers)
The last step of each Pipeline is a Serializer.
It consumes XML (in the form of SAX events) and generates a character stream for a client (Web browser, Acrobat Reader, etc.).
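
The idea of consuming SAX events and emitting characters can be tried out with Python's own SAX support. This is a sketch of the concept only, not Cocoon's HTML serializer; the function name `serialize_sax` is invented, and `XMLGenerator` is the Python standard library handler that writes each event back out as text.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

def serialize_sax(xml_bytes):
    """Feed SAX events from xml_bytes into a handler that writes characters."""
    out = io.StringIO()
    handler = XMLGenerator(out, encoding="utf-8")
    # The parser calls startElement/characters/endElement on the handler,
    # and the handler appends the corresponding text to `out`.
    xml.sax.parseString(xml_bytes, handler)
    return out.getvalue()

text = serialize_sax(b"<page><title>SST</title></page>")
print(text)
```

Because the handler only sees events, the document never has to be held in memory as a whole, which is one reason event streams suit a pipeline design.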

Declare the Serializer
In sitemap.xmap:
<map:serializers default="html">
  <map:serializer mime-type="text/html" name="html" src="...HTMLSerializer">
    <buffer-size>1024</buffer-size>
  </map:serializer>
  …
</map:serializers>

The Completed Pipeline
<map:match pattern="data/*.html">
  <map:generate src="data/{1}.xml"/>
  <map:transform src="datastyle/HTMLstyle.xsl"/>
  <map:serialize/>
</map:match>

Pipeline to make PDF output
<map:match pattern="data/*.pdf">
  <map:generate src="data/{1}.xml"/>
  <map:transform src="stylesheets/FOstyle.xsl"/>
  <map:serialize type="fo2pdf"/>
</map:match>
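
With both pipelines in place, the sitemap behaves like an ordered table of patterns. The Python sketch below models only that routing decision. The table `SITEMAP` and the function `choose_pipeline` are invented for the example, and `fnmatchcase` lets `*` match `/` as well, which may differ from Cocoon's wildcard rules.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, stylesheet, serializer) rows mirroring the two pipelines above.
SITEMAP = [
    ("data/*.html", "datastyle/HTMLstyle.xsl", "html"),
    ("data/*.pdf", "stylesheets/FOstyle.xsl", "fo2pdf"),
]

def choose_pipeline(path):
    """Return (stylesheet, serializer) for the first matching pattern, else None."""
    for pattern, stylesheet, serializer in SITEMAP:
        if fnmatchcase(path, pattern):
            return stylesheet, serializer
    return None

print(choose_pipeline("data/data.sst.pdf"))
```

The same source document, data.sst.xml, serves both rows; only the stylesheet and serializer change with the requested extension.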

http://www.cdc.noaa.gov/cocoon/data/data.sst.html

http://www.cdc.noaa.gov/cocoon/data/data.sst.pdf

The Dreaded Demo
Demo: Data Set Descriptions at CDC.

Cocoon is all this and more!
Action Components to do complex initialization (e.g. get a database connection pool) during pipeline setup.
Resource Components are internal reusable pipeline fragments.
XSP and Logic Sheets offer capabilities similar to JSP with further separation of the logic.

Resources
www.apache.org
Inside XSLT by Steven Holzner (New Riders)
Java and XSLT by Eric M. Burke (O’Reilly)

Reality Check!
We have not (yet) put this system in production.
Still designing the XML representation.
Still learning about using Cocoon with a relational database.
Considering using XSP pages.

Conclusions
Cocoon offers the potential to use and reuse one bit of XML content for many purposes.
Most operations for Web hosting the XML content are built into Cocoon.
Unlimited customization by writing your own Components.
Content is easily maintained and separated from presentation.