Markup for Statisticians: An Introduction to Alphabet Soup


WWW
In the 1980s the World Wide Web (WWW) came into being. For documentation on most projects that affect the WWW, look at [URL missing]. A major factor in its success was the notion of a markup language.

WWW
The technology hurdle that this overcame was the separation of content from presentation. A web browser is responsible for understanding and rendering the content in a web page.

WWW
That content is marked up using HTML (or a relative). In IE5, under the View menu, you will find an option for Source. Select a web page and view the source.

WWW
This is very nice: now everyone can view your page using any browser. All the browser has to do is understand and implement a number of HTML directives. Notions such as linking (directing people to another place by a click) are easily implemented in this framework.

WWW
NCSA puts out a document entitled A Beginner's Guide to HTML; this is one of the better guides I have seen. There are many books (most much longer than they need to be). O'Reilly's HTML Pocket Reference seems pretty useful.

WWW
How does it work? Your web browser opens a special type of connection (usually an HTTP connection) to another computer and, through that protocol, asks for the information on a particular web page. Other types of connections, such as FTP, are also generally supported.
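
The request-and-response exchange described above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser is written: the page content, the OnePage handler, and the local address are all made up for the example, and a tiny server is started on the same machine so that the sketch does not depend on any real web site.

```python
import http.client
import http.server
import threading

PAGE = b"<html><body><h1>Hello</h1></body></html>"  # stand-in web page (invented)

class OnePage(http.server.BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with the same page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet

# Port 0 asks the operating system for any free port.
server = http.server.HTTPServer(("127.0.0.1", 0), OnePage)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The "browser" side: open an HTTP connection and ask for one page.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
conn.request("GET", "/")
response = conn.getresponse()
status, body = response.status, response.read()
conn.close()

server.shutdown()
server.server_close()
```

What comes back is just the marked-up text; turning it into something to look at is the browser's job.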

WWW
Now we have solved the problem of how to put content onto your computer. How do we solve the problem of providing programs or applications to perform some computations? This is where Java came in. Java is a language that has a strong security model.

WWW: Java
Java applets can be secured, in the sense that you can determine before you run them that they will do nothing harmful to your computer. If you could not ensure that, you would be ill advised to run an applet. This is why there are no C or C++ applets: they can be written, but no one should be silly enough to run one.

WWW
While all web browsers use HTTP as their basic means of transferring data, other programs can also use HTTP. The web is now full of information about all sorts of topics. How do we begin to make sense of that information?

WWW
HTML has severe limitations. These became apparent when search engines were first being developed. The problem is that there is no way to indicate the meaning of any of the information. For example, consider the tags that you have available for a table.

WWW: Table tags
The table tags are <table>, <caption>, <tr>, <th>, and <td>, with a few more in HTML 4.0. Except by convention, there is no way to indicate the content of the table. But tables often contain data, and it is data that we want to use. Without information on content, it is hard to use the data programmatically.

WWW
We want to have smart programs. There is no sense in having people find and manipulate data: if it is on the web, it would be nice if it were in a format that a program could deal with. The more we can automate, the more we can do.

WWW and R
Open R and look at the manual page for connections. Look at URL connections. We want to open a connection to Leo Breiman's home page:
bhp <- url("[URL missing]", open = "r")
bhp.content <- readLines(bhp)

WWW and R
Now look at what bhp.content contains. Dr. Breiman has also put up a data set at [URL missing]. Open a URL connection to this page and read the data. What does it look like? What would we like to do with it?

WWW and R
We would probably like to put it into a data frame. We would also like to know what the data mean. There is no way to do that with HTML except by convention, and even then we have to parse the data. Writing parsers is complicated.
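
To make the parsing problem concrete, here is a minimal Python sketch that pulls the cells out of an HTML table using the standard library's html.parser. The page fragment, its column names, and the TableReader class are invented for illustration; a real page would need far more care.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the data and headings are invented.
PAGE = """
<table>
  <tr><th>subject</th><th>height</th></tr>
  <tr><td>a</td><td>24</td></tr>
  <tr><td>b</td><td>31</td></tr>
</table>
"""

class TableReader(HTMLParser):
    """Collect the text of every table cell, one list per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # a new row begins
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")      # a new, still empty, cell

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()

reader = TableReader()
reader.feed(PAGE)
header, *records = reader.rows
```

Even in this tidy case every value comes back as a string, and nothing in the markup says that the first row is a header or that height is numeric. All of that is convention.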

WWW and XML
The eXtensible Markup Language (XML) is intended to provide the missing functionality. It comes with a number of additional tools: XSL, XSLT, XPointer, XLink, and XPath. XML is a simplified form of SGML.

XML
XML is becoming the standard for data transfer. It is also becoming popular for tasks like remote procedure calls and for communicating between cooperating computing languages via SOAP.

XML
With XML you can define your own tags. To give them meaning you use a Document Type Definition (DTD). The DTD specifies which tags are valid, which attributes a tag can have, and the order (or nesting) requirements.

XML
In XML every opening tag must have a corresponding closing tag. Any other tags that were opened after a given tag must be closed before that tag is closed. This ensures proper nesting of the XML tags and makes it possible to parse the documents easily.

XML
An element consists of two tags, an opening tag and a closing tag; for example, the text orange enclosed by a matching pair of tags is an element. Any text between the tags is considered to be part of the element and is formatted according to the rules for that element.

XML
Elements can have attributes; for example, a height element with the content 24 can carry its units as an attribute. Notice that under these circumstances it is reasonably easy to extract all the heights from an XML document (and to get the units right!). Attribute values must be enclosed in quotation marks, either double or single.
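
As a rough illustration of how easy the extraction becomes, the Python sketch below reads every height together with its units using the standard library's ElementTree. The element name height and the attribute name units follow the description above; the surrounding people and person elements, the values, and the unit names are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only "height" and "units" come from the slide's
# description; everything else here is made up for the example.
DOC = """
<people>
  <person><height units="inches">24</height></person>
  <person><height units="cm">61</height></person>
</people>
"""

root = ET.fromstring(DOC)

# Each height arrives with its own units, so nothing has to be guessed.
heights = [(float(h.text), h.get("units")) for h in root.iter("height")]
```

Compare this with the HTML table case, where the units would live in a column heading or a caption, if anywhere.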

XML
A non-empty element must have both an opening and a closing tag. An empty element might be there as a placeholder or to provide its attributes. When an empty element is written as a single tag, the closing tag is not required, but a / must be placed before the closing >.

XML
Tags must be nested correctly, so it is not allowed to wrap the text "that's all folks" in two tags and then close the outer tag first. Since bar is the second tag to open, it must be the first one to close. An XML document that adheres to these rules is said to be well-formed.
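
A parser will enforce these rules for you. The sketch below, in Python with the standard library's ElementTree, checks three small documents. The tag name bar and the text come from the slide; foo is an assumed name for the outer tag, and is_well_formed is a helper written just for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# "foo" is a hypothetical name for the outer tag.
nested_ok = "<foo><bar>that's all folks</bar></foo>"    # bar closes first
nested_bad = "<foo><bar>that's all folks</foo></bar>"   # foo closes too early
empty_ok = "<foo><bar/></foo>"                          # empty element with /

results = [is_well_formed(doc) for doc in (nested_ok, nested_bad, empty_ok)]
```

Note that this says nothing about validity: the parser is checking only the nesting and tag-matching rules, not whether foo and bar are tags that some DTD permits.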

XML
Well-formed XML documents can be parsed using standard methods. A second concept that can be applied to XML documents is validity. An XML document is said to be valid if it conforms to its DTD. XML documents can be well-formed but not valid.

XML
XML documents can be useful even when there is no DTD. In other situations (e.g. my system for documenting clinical trials) the use of a DTD to ensure validity is necessary. Recently the DTD specification has been extended; the new method is called schema and is more flexible than a DTD.

XML
PI: processing instructions. A PI tells an application to carry out a specific task. A PI is not part of the rendered document but rather is an instruction to either the XML parser or to an application that uses the resultant document.

XML
PIs are of the form <?target data?>. An example of a PI is <?xml version="1.0" standalone="no"?>. This PI is included as the first line in almost all XML documents. It indicates the version, and standalone="no" indicates that a DTD is required.
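
Here is a small Python sketch of how a program sees processing instructions, using the standard library's minidom. The document, the report.xsl file name, and the xml-stylesheet instruction are chosen for illustration and are not from the lecture. The first line uses standalone="yes" because this made-up document has no DTD.

```python
from xml.dom import minidom

# Hypothetical document; report.xsl is an invented file name.
DOC = """<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="report.xsl"?>
<report><title>Example</title></report>
"""

doc = minidom.parseString(DOC)

# Processing instructions appear as their own nodes, outside the content.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

# The content of the document itself is unaffected by them.
title = doc.documentElement.getElementsByTagName("title")[0].firstChild.data
```

With this parser the opening <?xml ...?> line is consumed by the parser itself and does not show up in the list; only the stylesheet instruction, which is aimed at an application, is reported.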

XML
Namespaces: we need some means of limiting the scope of the definition of a tag. Suppose we have combined two DTDs in a single XML document (this is both legal and useful). Suppose that both DTDs define a tag named leg, except that in one it stands for a person's leg and in the other the leg of a chair.

XML
We wouldn't want to mix those up. Namespaces can be used to ensure that tags from one DTD do not get confused with tags from another. Namespaces really don't do anything, though; they are simply macro substitutions.

XML
Namespaces should be unique. It is common to use a URI (which need not exist). Once a namespace prefix has been declared, tags can use that prefix, and this is the equivalent of prepending the namespace string to the tag.
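
The "prepending" view of namespaces is visible directly in Python's ElementTree, which reports each tag with its namespace string attached. In this sketch the two example.org URIs, the prefixes body and chair, and the room element are all invented; only the tag name leg comes from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the URIs need not exist, they only need to differ.
DOC = """
<room xmlns:body="http://example.org/anatomy"
      xmlns:chair="http://example.org/furniture">
  <body:leg>left</body:leg>
  <chair:leg>front-left</chair:leg>
</room>
"""

root = ET.fromstring(DOC)

# The parser replaces each prefix with its namespace string.
tags = [child.tag for child in root]

# Asking for chair legs can no longer pick up a person's leg by mistake.
chair_legs = root.findall("{http://example.org/furniture}leg")
```

Both elements are called leg in the file, but after parsing their names differ, which is all the namespace mechanism is meant to achieve.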

XSL
eXtensible Stylesheet Language. This has not yet been completely formed (but should be soon). A style sheet describes how the XML document should be transformed to provide the rendered output. You can have multiple style sheets for any XML document.

XSL
This means that you can have different versions of the document depending on whether the output is a web page, a PDF document, input for another processing step, and so on. XSL (through XSLT) provides a means of rendering the data in an XML document.

XPath
An XML document has a tree structure. There is a root node, and below that there can be many more nodes. For XSLT (and XPointer) to work well, they need to be able to reference different elements within the document. They do this via XPath.

XPath
A simple example: *[not(self::FOO:Bar)] is an XPath expression that refers to all children of the current node whose name (the tag) is not FOO:Bar. You can refer to parent nodes, children, grandparents, and so on.
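
To show what that expression selects, the Python sketch below applies the same filter by hand to a small tree. As far as I know, the standard library's ElementTree supports only a subset of XPath and does not accept not(self::...), so the filter is written as an ordinary list comprehension rather than as an XPath call. The document, the namespace URI bound to FOO, and the Baz and other elements are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; only the name FOO:Bar comes from the example above.
DOC = """
<root xmlns:FOO="http://example.org/foo">
  <FOO:Bar>skip me</FOO:Bar>
  <FOO:Baz>keep me</FOO:Baz>
  <other>keep me too</other>
</root>
"""

FOO = "{http://example.org/foo}"   # how ElementTree spells the FOO: prefix
root = ET.fromstring(DOC)          # the root plays the role of the current node

# The path "*" selects all children of the current node.
all_children = root.findall("*")

# Hand-written equivalent of *[not(self::FOO:Bar)]:
# every child whose tag is not FOO:Bar.
kept = [child for child in all_children if child.tag != FOO + "Bar"]
kept_text = [child.text for child in kept]
```

A full XPath implementation would evaluate the original expression directly; the point here is only what set of nodes it picks out.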

XLL
eXtensible Linking Language. Another part of the XML family is the set of mechanisms for linking different documents and portions of documents. XLink and XPointer are the two mechanisms used to carry out the linking (similar to what goes on in a web page but with more control).

XLink
A link is only an assertion of a relationship between pieces of a document (or documents). How that link is presented to the user depends on many things and can be quite different in different settings. XML IDs are used to provide unique labels for XLink to link to.

XPointer
IDs give you a flexible way to link to parts of the same document. When you want to link to other documents, you need XPointer. The syntax is pretty complex.

Literate Programming
Literate programming is an idea that originated with Don Knuth. He wanted a system that allowed him to mix text and code in a more natural way, so that documentation could be read easily by humans.

Literate Programming
To make the code runnable, the code segments are extracted and placed in a separate file. The development version of R includes (and a separate library will soon provide) a version of literate programming for the R language. It is called Sweave.

Sweave
The idea is to produce a LaTeX-like document that has a mix of LaTeX and R code. This document is passed through an S engine, and the code may be replaced by the output that it generates (including graphics).

Sweave
This allows you to easily update reports when the data change. It also allows you to document the code together with the report that the code is used to write. See the Sweave User Manual, which is also provided for today's lecture.

Sweave
A second but important use for Sweave is to document R packages. Using Sweave we can produce files that contain examples of analyses. The Tangle facility allows us to extract the code segments into separate files and to run them.

STangle
Tangling is sort of the opposite of weaving: it separates the components. For R/S packages the text portion is generally not of interest. The code portions allow us to ensure that the program is still functioning as we expect. It allows us to put much more complex examples into our code.
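
The tangling step can be sketched in a few lines of Python. This is not Sweave's own implementation, only an illustration of the idea, assuming the noweb-style chunk markers that Sweave documents use: a line of the form <<label>>= opens a code chunk and a line containing only @ closes it. The sample document and the tangle function are invented for the example.

```python
import re

# Hypothetical Sweave-style source: LaTeX text with two R code chunks.
SOURCE = r"""\section{Heights}
The mean height is computed below.
<<mean-height>>=
heights <- c(24, 31, 28)
mean(heights)
@
A histogram follows.
<<histogram, fig=TRUE>>=
hist(heights)
@
"""

CHUNK_START = re.compile(r"^<<(.*)>>=\s*$")

def tangle(text):
    """Return the code chunks of a noweb-style document, dropping the prose."""
    chunks, current = [], None
    for line in text.splitlines():
        if current is None:
            if CHUNK_START.match(line):
                current = []              # a chunk header opens a code chunk
        elif line.strip() == "@":         # a lone @ closes it
            chunks.append("\n".join(current))
            current = None
        else:
            current.append(line)
    return chunks

code = tangle(SOURCE)
```

Writing the extracted chunks to a file gives a script that can be run on its own, which is what makes the examples in a package checkable.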

Sweave
Once this becomes a stable part of R, I anticipate that most of you will find it a very useful device for doing homework assignments and data analyses.