Structured-Document Processing Languages (3 cu), Spring 2004
Pekka Kilpeläinen
University of Kuopio, Department of CS & Applied Math

SDPL 2004, Notes 1: Introduction

1 Introduction
First: Overview and Arrangements
  What is this course about?
1.1 Structured Documents
  Review of basic notions

Goals of the Course
• To get familiar with the most important models and languages for
  – manipulating
  – representing
  – transforming and
  – querying
  structured documents (or XML)
• “Generic XML processing technology”
  – very little about specific XML applications or commercial systems

NOT an Exhaustive Survey
• Bias in selecting course topics:
  – estimated usefulness/value
    » centrality (implying longer-lasting value)
    » maturity: Stable specifications? Existing implementations?
  – Lecturer up-to-date?
• Emphasis on processing data in the form of documents, rather than describing it

Motivation?
• Practical relevance: “eBusiness” is HOT!
• Academic interest in models of information processing
[Figure labels: XML, Internet, order, invoice]

Preliminary Outline
1 Introduction
  Overview and Arrangements
  1.1 Structured Documents
2 Document Instances and Grammars
  2.1 Trees and their Grammars
  2.2 Review of XML basics: DTDs, Namespaces, Schemas
3 Programmatic Manipulation of Structured Documents (XML APIs)
  3.1 SAX
  3.2 DOM
  3.3 JAXP

Preliminary Outline (2)
4 Styling Structured Documents I
  4.1 Essentials of Cascading Style Sheets
5 Transforming Structured Documents
  5.1 Addressing: XPath
  5.2 XSLT
6 Styling Structured Documents II: XSL
7 XML wrapping (or translating data to XML)
8 Querying Structured Documents - W3C XML Query Language XQuery

Methodological Goals
• Some central professional skills
  – consulting technical specifications
  – experimenting with SW implementations
• Ability to think…?
  – to find out relationships
  – to apply knowledge in new situations
• ("Pidgin English" for scientific communication)

Administration
• An elective graduate-level (laudatur) special course
  – suitable for all specialisation lines (esp. CS/SWE)
• 3 cu (120 hours of work)
• Lectures March 9 – May 6, MT2/E26–27
  – Lecturer:
• Assistant:

Administration: Exercises
• Exercises March 24 – May 12, MT2/E26–27
  – essential for becoming familiar with the technology
  – mainly normal homework assignments, some hands-on practice; solutions discussed in class
• + a "mini-project"
    » programming/modifying a document processing application (XML/Java/DOM/JAXP/XSLT)
    » individually or in small groups
    » to be handed in to the lecturer
  – credited like other exercises (grading based on quality by a factor in [0, 1.5])

Administration: Grading
• Course exam on Tuesday, May 18, in SL
  – minimum of 50% of exam points to pass the course
  Grade = (12*Exam/MaxExam + 4*HomeWork/MaxHomeWork - 4)
• Opportunity to retake the exam
  – June 3 (again ≥ 50% to pass; grade with/without homework credits, whichever is better)
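
As a worked example of the grading formula above, the following Python sketch evaluates it for given point totals. The function names and the sample point totals are illustrative assumptions. The slides do not say how the resulting value is rounded or mapped to a final grade, so the sketch returns the raw value only.

```python
def passes_exam(exam, max_exam):
    """Return True if the exam score meets the 50% minimum stated on the slide."""
    return 2 * exam >= max_exam


def raw_grade(exam, max_exam, homework, max_homework):
    """Evaluate 12*Exam/MaxExam + 4*HomeWork/MaxHomeWork - 4.

    Rounding and the grade scale are not given in the slides, so the
    raw value of the formula is returned unchanged.
    """
    return 12 * exam / max_exam + 4 * homework / max_homework - 4


if __name__ == "__main__":
    # Hypothetical point totals, chosen only for illustration.
    exam, max_exam = 30, 60          # exactly 50% of the exam points
    homework, max_homework = 10, 20  # half of the homework points
    print(passes_exam(exam, max_exam))                        # True
    print(raw_grade(exam, max_exam, homework, max_homework))  # 4.0
```

With these sample numbers the exam term contributes 6, the homework term 2, and subtracting 4 leaves a raw value of 4.0.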

Material
• No single textbook
• Reports, articles
• Course home page
  – lecture notes, exercises, reference material, announcements, …
• Possible background text: Deitel, Deitel, Nieto, Lin & Sadhu: XML - How to Program. Prentice Hall, 2001.

Background Check
• Basic knowledge of structured documents and document standards
  – Course "Introduction to Document Standards"?
  – HTML?
• Programming languages and concepts
  – Java? OO programming?
  – Unix/Linux \ Windows?
• Formal language theory
  – Theory of Computation / "Ohjelmoinnin ja laskennan teoria"?
  – regular expressions, automata?
  – context-free grammars, parse trees?

Course Expectations?

1.1 Structured Documents
• Document:
  – a structured representation of information on some medium ( message)
  – normally for a human reader
    » memos, manuals, articles, books, …
  – also application-to-application messages
    » EDI (electronic data interchange)
  – "prose-oriented XML" vs "data-oriented XML"
  – possibly non-permanent, dynamically generated
  – processable or conceivable as a unit
    » (a web page vs a web site)

Text-Based Documents
• We concentrate on textual or text-based documents
  – character data is the major constituent of the information content
  – as opposed to, say, multimedia documents
• Next: Presentation vs Structure

Presentation vs Structure
• Presentation informs the human reader about the meaning of text and the role of its parts
• Markup (Finnish: merkkaus): indicating the presentation or the meaning of different parts of text
  – originally hand-written annotations for the typesetter
  – nowadays primarily codes embedded in digital documents

Markup
• Procedural markup
  – formatting commands (start boldface, produce an empty line, indent 5 mm, ...)
  – proprietary word processor formats, nroff, TeX, ...
• Descriptive or generic markup
  – indicating the logical structure of text using chosen names
  – LaTeX: \begin{abstract} ... \end{abstract}
  – HTML: ....
• Markup language (Finnish: merkkauskieli)
  – a fixed set of markup notations (e.g. nroff, TeX, HTML, SVG, …)
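
The difference between the two kinds of markup can be sketched in code. In the minimal Python illustration below, a text is marked up descriptively as (role, content) pairs, and two separate style tables turn the same roles into different formatting commands. The role names, the style tables and the output notation are all invented for this example and do not correspond to any real markup language.

```python
# Descriptive markup: each part of the text is labelled with its logical role.
document = [
    ("title", "Structured Documents"),
    ("abstract", "Markup indicates the role of parts of text."),
    ("para", "Hierarchy is ubiquitous in documents."),
]

# Two hypothetical style tables mapping logical roles to formatting
# commands (the procedural side). The command names are invented.
PRINT_STYLE = {"title": "BOLD 18pt", "abstract": "ITALIC INDENT 5mm", "para": "PLAIN"}
SCREEN_STYLE = {"title": "BOLD 24pt", "abstract": "PLAIN INDENT 10mm", "para": "PLAIN"}


def render(doc, style):
    """Attach the formatting command chosen by `style` to every part of `doc`."""
    return [f"[{style[role]}] {text}" for role, text in doc]


if __name__ == "__main__":
    for line in render(document, PRINT_STYLE):
        print(line)
    for line in render(document, SCREEN_STYLE):
        print(line)
```

The descriptively marked-up document stays unchanged between the two runs; only the style table decides which formatting commands are produced.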

Structured Documents?
Most liberally, any document is structured (punctuation, words, sentences, fields, …)
but especially descriptively marked-up documents
... especially if they adhere to a rigorous specification of structure

Structure in Documents
• Hierarchy or nesting is ubiquitous
  – chapters of books, warnings in maintenance manuals, ...
• Linear order is essential in prose documents
  – less important in documents representing data objects
• Hypertext and cross-references
• We'll mainly be dealing with the manipulation of hierarchical, or tree-like, document structures
Next: How are these modelled?
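
To make the notion of a tree-like document structure concrete, here is a minimal sketch that parses a small XML document with Python's standard xml.etree.ElementTree module and lists each element with its nesting depth. The sample document, its element names and the outline helper are assumptions made for this illustration, not material from the course.

```python
import xml.etree.ElementTree as ET

# A small, made-up document; element names are illustrative only.
XML_TEXT = """<book>
  <chapter>
    <title>Introduction</title>
    <para>Documents have structure.</para>
  </chapter>
  <chapter>
    <title>Markup</title>
  </chapter>
</book>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for `element` and its descendants in document order."""
    pairs = [(depth, element.tag)]
    for child in element:
        pairs.extend(outline(child, depth + 1))
    return pairs


if __name__ == "__main__":
    root = ET.fromstring(XML_TEXT)
    for depth, tag in outline(root):
        print("  " * depth + tag)
```

The recursion visits elements in document order, so the result reflects both the nesting (hierarchy) and the linear order of the parts.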