Structured-Document Processing Languages (3 cu), Spring 2002 Pekka Kilpeläinen University of Kuopio Department of CS & Applied Math



SDPL 2002, Notes 1: Introduction

1 Introduction
First: Overview and Arrangements
What is this course about?
1.1 Structured Documents
Review of basic notions

Goals of the Course
• To become familiar with the most important models and languages for
  – manipulating
  – representing
  – transforming and
  – querying
  structured documents (or XML)
• "Core XML processing technology"
  – little about specific XML applications or commercial systems

NOT an Exhaustive Survey
• Bias in selecting course topics:
  – estimated usefulness/value
    » centrality (implying longer-lasting value)
    » maturity: Stable specifications? Existing implementations?
  – Lecturer up-to-date?
• Emphasis on active formalisms (for describing processes on documents) instead of describing documents/data

Motivation?
• Practical relevance: "eBusiness" is HOT!
• Academic interest in models of information processing
[Figure labels: XML, Internet, order, invoice]

Preliminary Outline
1 Introduction
  Overview and Arrangements
  1.1 Structured Documents
2 Document Instances and Grammars
  2.1 Trees and their Grammars
  2.2 Review of XML basics: DTDs, Namespaces, Schemas
3 Programmatic Manipulation of Structured Documents (XML APIs)
  3.1 SAX
  3.2 DOM

Preliminary Outline (2)
4 Styling Structured Documents I
  4.1 Essentials of Cascading Style Sheets
5 Transforming Structured Documents
  5.1 Addressing: XPath
  5.2 XSLT
6 Styling Structured Documents II: XSL
7 XML Web-Site Architectures
8 XML wrapping (or translating data to XML)
9 Querying Structured Documents
  9.1 Region Algebra and sgrep
  9.2 XML Query Languages

Methodological Goals
• Some central professional skills
  – consulting technical specifications
  – experimenting with SW implementations
• Ability to think…?
  – to find out relationships
  – to apply knowledge in new situations
• ("Pidgin English" for scientific communication)

Administration
• An elective graduate-level (laudatur) special course
  – suitable for all specialisation lines (esp. CS/SWE)
• 3 cu (≈ 120 hours of work)
• Lectures Mar 11 - May 8, Microteknia MT2
  – Lecturer:
• Assistant:

Administration: Exercises
• Exercises Mar 20 - May 15(?), MT2
  – essential for becoming familiar with the subject
  – mainly normal homework assignments, solutions discussed in class
  – 1 or 2 groups, depending on attendance
• + a few (1-3) "mini-projects"
    » reading and summarising tasks?
    » hands-on experimentation?
    » to be handed in to the lecturer
  – credited like other exercises (scaled based on quality by a factor in [0, 1.5])

Administration: Grading
• Course examination on Wed, May 22, in the Great Lecture hall (SL)
  – minimum of 50% of exam points to pass the course
  Grade = (32*Exam/MaxExam + 12*HomeWork/MaxHomeWork - 8)/3
• Opportunity to retake the exam
  – June 12 (≥ 50% to pass, grade with/without homework credits)
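The grading formula above can be evaluated with a few lines of Python. This is only a sketch: the function name `course_grade` and the choice to return `None` for an exam score below 50% are assumptions made for illustration, since the slide only says that 50% of the exam points is the minimum to pass.

```python
def course_grade(exam, max_exam, homework, max_homework):
    """Evaluate the grading formula from the slide.

    Returns None when the exam score is below the 50% pass threshold
    (an assumed convention; the slide only states the threshold).
    """
    if exam / max_exam < 0.5:
        return None  # exam not passed, so the formula is not applied
    return (32 * exam / max_exam + 12 * homework / max_homework - 8) / 3
```

Under this reading, exactly 50% of the exam points with no homework credit gives (16 + 0 - 8)/3 = 8/3, and full marks in both parts give (32 + 12 - 8)/3 = 12.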

Material
• No single textbook
• Reports, articles
• Course home page
  – lecture notes, exercises, reference material, announcements, …
• Recommended (but not required) text: Deitel, Deitel, Nieto, Lin & Sadhu: XML - How to Program. Prentice Hall, 2001.

Background Check
• Basic knowledge of structured documents and document standards
  – Course "Document standards"?
  – HTML?
• Programming languages and concepts
  – OO programming, Java?
  – Unix/Linux \ Windows?
• Formal language theory
  – Theory of Computation / "Ohjelmoinnin ja laskennan teoria"?
  – regular expressions, automata?
  – context-free grammars, parse trees?

Course Expectations?

1.1 Structured Documents
• Document:
  – a structured representation of information on some medium (  message)
  – normally for a human reader
    » memos, manuals, articles, books, …
  – also application-to-application messages
    » EDI (electronic data interchange)
  – "prose-oriented XML" vs "data-oriented XML"
  – possibly non-permanent, dynamically generated
  – processable or conceivable as a unit
    » (a web page vs a web site)

Text-Based Documents
• We concentrate on textual or text-based documents
  – character data is the major constituent of the information content
  – as opposed to, say, multimedia documents
• Next: Presentation vs Structure

Presentation vs Structure
• Presentation informs the human reader about the meaning of text and the role of its parts
• Markup: indicating the presentation or the meaning of different parts of text
  – originally hand-written annotations for the typesetter
  – nowadays primarily codes embedded in digital documents

Markup
• Procedural markup
  – formatting commands (start boldface, produce an empty line, indent 5 mm, ...)
  – proprietary word processor formats, nroff, TeX, ...
• Descriptive or generic markup
  – indicating the logical structure of text using chosen names
  – LaTeX: \begin{abstract} ... \end{abstract}
  – HTML: ....
• Markup language
  – a fixed set of markup notations (e.g. nroff, TeX, HTML, SVG, …)
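The difference between procedural and descriptive markup can be illustrated with a small Python sketch. Everything in it is assumed for illustration and does not come from the course material: the element names (`article`, `title`, `abstract`, `para`), the `STYLE` table, and the `render` function. The document itself only names the logical role of each part, and a separate table decides how each role is presented.

```python
import xml.etree.ElementTree as ET

# Descriptive markup: element names say what each part *is*, not how it looks.
# The element names are hypothetical, chosen only for this example.
DOCUMENT = """<article>
  <title>Structured Documents</title>
  <abstract>Markup can describe logical structure.</abstract>
  <para>Presentation is decided separately.</para>
</article>"""

# Hypothetical style table: maps each logical name to a formatting function.
STYLE = {
    "title": str.upper,                       # titles in capitals
    "abstract": lambda text: "    " + text,   # abstracts indented
    "para": lambda text: text,                # paragraphs unchanged
}

def render(xml_text, style):
    """Format each child of the root element according to the style table."""
    root = ET.fromstring(xml_text)
    lines = []
    for child in root:
        formatter = style.get(child.tag, lambda text: text)
        lines.append(formatter(child.text or ""))
    return lines

if __name__ == "__main__":
    print("\n".join(render(DOCUMENT, STYLE)))
```

Changing the presentation only requires a different `STYLE` table; the marked-up document stays untouched. With procedural markup, the formatting commands would be embedded in the text itself.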

Structured documents?
Most liberally, any document is structured (punctuation, words, sentences, fields, …)
but especially descriptively marked-up documents...
especially if they adhere to a rigorous specification of structure.

Structure in documents
• Hierarchy or nesting is ubiquitous
  – chapters of books, warnings in maintenance manuals, ...
• Linear order essential in prose documents
  – less important in documents representing data objects
• Hypertext and cross-references
• We'll be mainly dealing with manipulation of hierarchical, or tree-like document structures
Next: How are these modelled?
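As a rough illustration of the hierarchical, tree-like view of documents, the following Python sketch parses a small XML document and walks the resulting tree in document order. The sample `BOOK` document and the helper names `outline` and `max_depth` are assumptions introduced for this example, not part of the course material.

```python
import xml.etree.ElementTree as ET

# A hypothetical document with nested chapters and sections.
BOOK = """<book>
  <chapter>
    <title>Introduction</title>
    <section><title>Structured Documents</title></section>
  </chapter>
  <chapter>
    <title>Grammars</title>
  </chapter>
</book>"""

def outline(element, depth=0):
    """Return one indented line per element, in document (linear) order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

def max_depth(element):
    """Return the number of nesting levels at and below this element."""
    return 1 + max((max_depth(child) for child in element), default=0)

if __name__ == "__main__":
    root = ET.fromstring(BOOK)
    print("\n".join(outline(root)))
    print("nesting depth:", max_depth(root))
```

The recursion follows the nesting of elements, while the order of children preserves the linear order of the text.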