CSCI/CMPE 4341 Topic: Programming in Python
Chapter 9: Python XML Processing
Xiang Lian
The University of Texas – Pan American
Edinburg, TX 78539


Objectives
In this chapter, you will:
– Understand XML
– Become familiar with the types of markup languages created with XML
– Learn to create XML markup programmatically
– Use the Document Object Model (DOM) to manipulate XML documents
– Explore the ElementTree package to retrieve data from XML documents

Introduction
XML was developed by the World Wide Web Consortium's (W3C's) XML Working Group (1996)
XML is a portable, widely supported, open technology for describing data
XML is quickly becoming the standard for data exchange between applications

XML Documents
XML documents end with the .xml extension
XML marks up data using tags, which are names enclosed in angle brackets
– Elements: individual units of markup (i.e., everything included between a start tag and its corresponding end tag)
– Nested elements form hierarchies
– Root element contains all other document elements
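The nesting rules above can be checked with a short Python sketch. The document below (library, book, title, year) is invented for illustration and is not one of the chapter's files; describe is a hypothetical helper that walks the hierarchy from the root element down.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this sketch.
SAMPLE = """<?xml version="1.0"?>
<library>
   <book>
      <title>Learning XML</title>
      <year>2001</year>
   </book>
</library>"""


def describe(element, depth=0):
    """Return one indented line per element, mirroring the nesting."""
    lines = ["   " * depth + element.tag]
    for child in element:  # iterating an element yields its child elements
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)  # the root element contains all other elements
outline = describe(root)
print("\n".join(outline))
```

Each level of indentation in the printed outline corresponds to one level of element nesting.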

© 2002 Prentice Hall. All rights reserved.

article.xml

Text content of the document:
– Simple XML
– December 21, 2001
– John Doe
– XML is pretty easy.
– In this chapter, we present a wide variety of examples that use XML.

Callouts:
– Optional XML declaration includes version information parameter
– XML comments delimited by <!-- and -->
– Root element contains all other document elements
– End tag has format </name>

XML Document
View XML documents
– Any text editor (Internet Explorer, Notepad, Visual Studio, etc.)
[Figure callout: Minus sign]

letter.xml

Text content of the document:

Jane Doe
Box
Any Ave.
Othertown
Otherstate

John Doe
Main St
Anytown
Anystate

Dear Sir:

It is our privilege to inform you about our new database managed with XML. This new system allows you to reduce the load on your inventory list server by having the client machine perform the work of sorting and filtering the data.

Please visit our Web site for availability and pricing.

Sincerely
Ms. Doe

Callouts:
– Root element letter
– Child element contact
– Attribute (name-value pair)
– Empty elements do not contain character data
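A minimal sketch of how attributes and empty elements look from Python, using ElementTree. Only the element names letter and contact come from the slide; the name and flag elements, the attribute names (type, priority) and their values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only "letter" and "contact" are taken from the slide;
# the other element names, attribute names and values are assumptions.
SAMPLE = """<letter>
   <contact type="sender">
      <name>Jane Doe</name>
      <flag priority="high" />
   </contact>
</letter>"""

letter = ET.fromstring(SAMPLE)
contact = letter.find("contact")
flag = contact.find("flag")

contact_type = contact.get("type")   # look up one attribute by name
flag_attributes = dict(flag.attrib)  # all name-value pairs of the element

# An empty element has no character data and no child elements.
flag_is_empty = flag.text is None and len(flag) == 0

print(contact_type, flag_attributes, flag_is_empty)
```

Note that an empty element can still carry attributes, which is why flag_attributes is not empty here.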

XML Namespaces
Provided for unique identification of XML elements
Namespace prefixes identify the namespace to which an element belongs

namespace.xml

<text:directory xmlns:text = "..."
   xmlns:image = "...">

Text content of the document:
– A book list
– A funny picture

Callouts:
– Attribute xmlns creates namespace prefix
– Namespace prefix bound to a URI
– Uses prefix text to describe element file

defaultnamespace.xml

<directory xmlns = "..."
   xmlns:image = "...">

Text content of the document:
– A book list
– A funny picture

Callouts:
– Creates default namespace by binding URI to attribute xmlns without prefix
– Element without prefix defined in default namespace
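The two namespace styles can be compared in Python with a small sketch. The URIs in the slides are not recoverable, so urn:example:text and urn:example:image below are placeholders; the description element and the filename values are likewise assumptions. ElementTree reports every namespaced tag as {URI}name, so a prefixed element and an element in the default namespace end up with the same tag when they are bound to the same URI.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:example:text"    # placeholder URI (assumption)
IMAGE_NS = "urn:example:image"  # placeholder URI (assumption)

# Every element carries an explicit prefix.
PREFIXED = f"""<text:directory xmlns:text="{TEXT_NS}" xmlns:image="{IMAGE_NS}">
   <text:file filename="book.xml">
      <text:description>A book list</text:description>
   </text:file>
   <image:file filename="funny.jpg">
      <image:description>A funny picture</image:description>
   </image:file>
</text:directory>"""

# Unprefixed elements fall into the default namespace.
DEFAULT = f"""<directory xmlns="{TEXT_NS}" xmlns:image="{IMAGE_NS}">
   <file filename="book.xml">
      <description>A book list</description>
   </file>
   <image:file filename="funny.jpg">
      <image:description>A funny picture</image:description>
   </image:file>
</directory>"""

prefixed_root = ET.fromstring(PREFIXED)
default_root = ET.fromstring(DEFAULT)
print(prefixed_root.tag)  # tag is reported as {URI}name

# The prefixes used for searching are local to this mapping; only the URIs matter.
ns = {"t": TEXT_NS, "i": IMAGE_NS}
text_files = default_root.findall("t:file", ns)
image_files = default_root.findall("i:file", ns)
```

The prefixes t and i in the search mapping differ from those in the documents on purpose: a prefix is only shorthand for the URI it is bound to.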

Document Object Model (DOM)
DOM parser retrieves data from an XML document
Hierarchical tree structure called a DOM tree
– Each component of an XML document is represented as a tree node
– Parent nodes contain child nodes
– Sibling nodes have the same parent
– A single root (or document) node contains all other document nodes

Example of Document Object Model (DOM)

article
   title
   date
   author
      firstName
      lastName
   summary
   contents
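The parent, child and sibling relationships can be explored with xml.dom.minidom from the Python standard library. This is a sketch under assumptions: the element names follow the tree above, the split of the author into firstName and lastName values is assumed, and the markup is written without whitespace between tags so that no extra text nodes appear among the children.

```python
from xml.dom import minidom

# Element names follow the DOM tree slide; the firstName/lastName values are
# assumed. No whitespace between tags, so children are element nodes only.
SAMPLE = (
    "<article>"
    "<title>Simple XML</title>"
    "<date>December 21, 2001</date>"
    "<author><firstName>John</firstName><lastName>Doe</lastName></author>"
    "</article>"
)

document = minidom.parseString(SAMPLE)  # the document node
root = document.documentElement         # the root element, article

title = root.firstChild     # first child node of the root
date = title.nextSibling    # sibling: shares the same parent as title
author = root.lastChild     # last child node of the root

child_names = [node.nodeName for node in root.childNodes]
print(child_names)
print(title.parentNode.nodeName)  # the parent of title is article
print(title.firstChild.data)      # element text is stored in a child text node
```

In a pretty-printed file the indentation between tags shows up as additional text nodes, so firstChild and nextSibling may return whitespace rather than an element.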

Processing XML in Python
Python packages for XML support
– 4DOM and xml.sax
– Generating XML dynamically is similar to generating HTML
– Python scripts can use print() calls or XSLT to output XML

names.txt

O'Black, John
Green, Sue
Red, Bob
Blue, Mary
White, Mike
Brown, Jane
Gray, Bill

Text file names.txt, used by fig16_02.py (Fig. 16.2)

fig16_02.py

#!c:\Python\python.exe
# Fig. 16.2: fig16_02.py
# Marking up a text file's data as XML.
# NOTE: the element names (contacts, LastName, FirstName) are assumed;
# only "contact" is named in the slide text.

import sys

# write XML declaration
print('<?xml version = "1.0"?>')

# open data file if it exists
try:
    file = open("names.txt", "r")
except IOError:
    sys.exit("Error opening file")

print("<contacts>")  # write root element

# list of tuples: (special character, entity reference)
# "&" comes first so the "&" in later entity references is not escaped again
replaceList = [("&", "&amp;"),
               ("<", "&lt;"),
               (">", "&gt;"),
               ('"', "&quot;"),
               ("'", "&apos;")]

# replace special characters with entity references
for currentLine in file.readlines():

    for oldValue, newValue in replaceList:
        currentLine = currentLine.replace(oldValue, newValue)

    # extract lastname and firstname
    last, first = currentLine.split(", ")
    first = first.strip()  # remove carriage return

    # write contact element
    print("""   <contact>
      <LastName>%s</LastName>
      <FirstName>%s</FirstName>
   </contact>""" % (last, first))

file.close()

print("</contacts>")  # write root's closing tag
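As an alternative to printing markup by hand, the same kind of document can be built with xml.etree.ElementTree, which escapes special characters in text when serializing. This is a sketch, not the figure's method: build_contacts is a hypothetical helper, the element names contacts, LastName and FirstName are assumptions, and the "Smith & Sons, Al" entry is invented to show the escaping.

```python
import xml.etree.ElementTree as ET


def build_contacts(lines):
    """Build a contacts tree from 'Last, First' lines (hypothetical helper)."""
    root = ET.Element("contacts")  # element names are assumptions
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        last, first = line.split(", ")
        contact = ET.SubElement(root, "contact")
        ET.SubElement(contact, "LastName").text = last
        ET.SubElement(contact, "FirstName").text = first
    return root


# The first two names come from names.txt; the third is invented.
names = ["O'Black, John", "Green, Sue", "Smith & Sons, Al"]

# tostring() escapes &, < and > in element text automatically.
xml_text = ET.tostring(build_contacts(names), encoding="unicode")
print(xml_text)
```

Building a tree first means the output is always well formed, at the cost of holding the whole document in memory before it is written.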

XML Processing Packages
Third-party package 4DOM, included with package PyXML, complies with W3C's DOM Recommendation
xml.sax, included with Python, contains classes and functions for SAX-based parsing
4XSLT, located in package 4Suite, contains an XSLT processor for transforming XML documents into other text-based formats
xml.etree.ElementTree, included with Python: import xml.etree.ElementTree
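For comparison with the tree-based packages, here is a minimal sketch of SAX-based parsing with xml.sax. The parser calls handler methods as it reads the document and does not build a tree. TagCounter is a hypothetical handler class, and the contacts markup is an assumed sample rather than a file from the chapter.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts start tags and collects character data."""

    def __init__(self):
        super().__init__()
        self.counts = {}
        self.text = []

    def startElement(self, name, attrs):
        # called once for every start tag the parser encounters
        self.counts[name] = self.counts.get(name, 0) + 1

    def characters(self, content):
        # called for character data; may arrive in several pieces
        if content.strip():
            self.text.append(content.strip())


# Assumed sample document, passed as bytes.
SAMPLE = (b"<contacts><contact><LastName>Green</LastName>"
          b"<FirstName>Sue</FirstName></contact></contacts>")

handler = TagCounter()
xml.sax.parseString(SAMPLE, handler)
print(handler.counts)
print(handler.text)
```

Because nothing is kept unless the handler stores it, this style suits documents that are too large to load as a tree.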

article2.xml

Text content of the document:
– Simple XML
– December 19, 2001
– Jane Doe
– XML is easy.
– Once you have mastered XHTML, XML is learned easily. Remember that XML is not for displaying information but for managing information.

XML document used by fig16_04.py

fig16_04.py

# Fig. 16.4: fig16_04.py
# Using ElementTree to traverse an XML document.

import sys
import xml.etree.ElementTree as etree

# open XML file
try:
    tree = etree.parse("article2.xml")
except IOError:
    sys.exit("Error opening file")

# get root element
rootElement = tree.getroot()
print("Here is the root element of the document: %s" % rootElement.tag)

# traverse all child nodes of root element
print("The following are its child elements:")

for node in rootElement:
    print(node.tag)  # print the element name rather than the object repr

# get first child node of root element
child = rootElement[0]
print("\nThe first child of root element is:", child.tag)
print("whose next sibling is:")

# get next sibling of first child
sibling = rootElement[1]
print(sibling.tag)
print('Value of "%s" is:' % sibling.tag, end=" ")
print(sibling.text)
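Besides indexing into the root element, ElementTree can retrieve data by element name or path. The sketch below assumes a document shaped like the DOM tree slide; the exact markup of article2.xml is not shown in the slides, so the element names and the firstName/lastName values are assumptions, and publisher is a deliberately missing element.

```python
import xml.etree.ElementTree as ET

# Assumed markup, modeled on the DOM tree slide.
SAMPLE = """<article>
   <title>Simple XML</title>
   <date>December 19, 2001</date>
   <author>
      <firstName>Jane</firstName>
      <lastName>Doe</lastName>
   </author>
   <summary>XML is easy.</summary>
</article>"""

root = ET.fromstring(SAMPLE)

title = root.findtext("title")                # text of a direct child
last_name = root.findtext("author/lastName")  # path through nested elements
missing = root.findtext("publisher", default="(none)")  # absent element

# iter() visits the element itself and all descendants in document order
all_tags = [element.tag for element in root.iter()]

print(title, last_name, missing)
print(all_tags)
```

Looking elements up by name keeps the script working if the order of the child elements changes, which positional indexing does not.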
