LIS512 lecture 4 XML: documents and records

up until now Relational databases can store information that is internal to an organization. But a lot of information has to relate to the outside world. This is when other considerations come in.

two basic types There are two basic types of outside communication tools:
– record sets
– documents
It's difficult to separate them precisely, but let's say that records are much more precisely defined.

general outside communication Traditional communication has mainly been achieved through the issuing of documents. Example: a court issues a judgment on a case. Most documents contain character data. But they also contain something else. That's where markup comes in.

special outside communication In special cases, organizations make records available to others. These records have a format that allows others to process them too. That format is quite rigid and usually purpose-built.

metadata Metadata is another form of record. The term metadata is usually defined as “data about data”. As such, it is controversial what counts as metadata and what counts as data. As far as we are concerned, metadata are records that are attached to documents.

metadata example: mail If you send and receive email, you will sometimes see what are known as headers. This collection of fields is of the form attribute: value. Example on next slide.

From Sun Jul 12 14:55:
Date: Sun, 12 Jul :55:
From: Thomas Krichel
To:
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Envelope-to: Thomas Krichel
Return-Path: Thomas Krichel
User-Agent: Mutt/ ( )
Status: RO
Content-Length: 5
Lines: 1
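The attribute: value form of mail headers can be read with Python's standard `email` package. This is only a sketch: the message below is made up, and its addresses are placeholders, not the values from the slide.

```python
from email import message_from_string

# Hypothetical message; the names and addresses are placeholders.
raw = (
    "From: Alice Example <alice@example.org>\n"
    "To: bob@example.org\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "hello\n"
)

msg = message_from_string(raw)

# Every header is an (attribute, value) pair.
headers = dict(msg.items())
for name, value in headers.items():
    print(f"{name} = {value}")

# The body starts after the first blank line.
body = msg.get_payload()
```

The headers are the metadata record; the body is the document they are attached to.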

metadata example: http headers
HTTP/ OK
Date: Wed, 24 Feb :34:33 GMT
Server: Apache/ (Debian)
Last-Modified: Sun, 13 Dec :03:42 GMT
ETag: "5f8271-f76-47a "
Accept-Ranges: bytes
Content-Length: 3958
Connection: close
Content-Type: text/html

example id3v1 A fixed 128-byte format.
– header: 3 bytes, "TAG"
– title: 30 bytes of the title
– artist: 30 bytes of the artist name
– album: 30 bytes of the album name
– year: 4-byte year
– comment: 30 bytes, or 28 bytes if a track number is stored
– zero-byte: 1 byte, present only if a track number is stored; it contains a binary 0
– track: 1 byte, present only if a track number is stored; the number of the track on the album, or 0
– genre: 1 byte, index in a list of genres, or 255
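Because every field sits at a fixed byte offset, a record like this can be decoded with a single format string. The sketch below uses Python's `struct` module. The function name `parse_id3v1`, the helper `pad`, the Latin-1 decoding and the sample values are assumptions made for illustration.

```python
import struct

# 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes; "<" disables alignment padding.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def parse_id3v1(tag: bytes) -> dict:
    """Decode a 128-byte ID3v1 tag; return an empty dict if it is not one."""
    if len(tag) != ID3V1.size or tag[:3] != b"TAG":
        return {}
    _, title, artist, album, year, comment, genre = ID3V1.unpack(tag)

    track = None
    # A zero byte at comment[28] followed by a non-zero byte means the
    # last two comment bytes are really the zero-byte and the track number.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]

    def text(raw: bytes) -> str:
        # Fields are padded with NUL bytes; Latin-1 is an assumed encoding.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "track": track,
        "genre": genre,
    }


def pad(value: str, size: int) -> bytes:
    """Helper for building the made-up sample tag below."""
    return value.encode("latin-1").ljust(size, b"\x00")


# Made-up sample tag with a track number stored.
sample = (
    b"TAG" + pad("Song", 30) + pad("Artist", 30) + pad("Album", 30)
    + b"1999" + pad("note", 28) + b"\x00" + bytes([7]) + bytes([255])
)
info = parse_id3v1(sample)
```

In a real file the tag is the last 128 bytes, so one would read it with `f.seek(-128, 2)` followed by `f.read(128)`.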

MARC MARC is an important example of a record format used by the library community. Integrated Library Systems (ILSs) all
– import MARC records into relational database systems
– export MARC records from relational database systems
MARC records encode records from library catalogs.

MARC format The MARC format is very complicated. The basic structure is
– Leader
– Directory
– Variable Control Fields
– Variable Data Fields

MARC leader The leader is described in der.html. When they talk about a character, they mean a byte.

MARC directory The MARC directory follows the leader. I am not sure what its purpose is. The general record structure is at ecstruc.html
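One way to see what the directory does is to read a record by hand. The sketch below assumes the MARC 21 layout as I understand it. The leader is 24 bytes, with the record length in positions 0 to 4 and the base address of the data in positions 12 to 16. Each directory entry is 12 bytes: a 3-digit tag, a 4-digit field length and a 5-digit starting position. Under that reading the directory is an index that says where each field starts and how long it is. The helper `build_record` and the sample fields are made up for the illustration.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"


def build_record(fields):
    """Assemble a tiny MARC-style record from (tag, data) pairs (illustration only)."""
    directory = b""
    body = b""
    for tag, data in fields:
        data += FIELD_TERMINATOR
        # Directory entry: 3-digit tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    directory += FIELD_TERMINATOR
    base_address = 24 + len(directory)
    record_length = base_address + len(body) + len(RECORD_TERMINATOR)
    # 24-byte leader; only positions 0-4 and 12-16 matter for this sketch.
    leader = f"{record_length:05d}nam a22{base_address:05d} a 4500".encode("ascii")
    return leader + directory + body + RECORD_TERMINATOR


def read_directory(record: bytes):
    """Use the leader and directory to slice the fields out of a record."""
    leader = record[:24]
    record_length = int(leader[0:5])
    base_address = int(leader[12:17])
    # The directory sits between the leader and the data, minus its terminator.
    directory = record[24:base_address - 1]
    entries = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        begin = base_address + start
        # The stored length includes the trailing field terminator.
        entries.append((tag, record[begin:begin + length - 1]))
    return record_length, entries


# Made-up sample: one control field and one data field.
sample_record = build_record([("001", b"12345"), ("245", b"10\x1faA title")])
record_length, entries = read_directory(sample_record)
```

Without the directory, a reader would have to scan the whole record for terminators to find a given field.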

MARC variable fields In MARC all field names are numbers. Each field name has three digits. Fields whose names start with 00 are called control fields. Other fields whose names start with 0 hold numbers and codes. Fields that do not start with 0 are the main fields we study in cataloging.

field indicators Each field other than those starting with 00 has two indicator positions, of which zero, one or two are defined for a given field. A field indicator says something additional about the field.

subfields Fields other than the ones starting with 00 admit subfields. A subfield is identified by a one-character code, usually a letter a to z; digits are also used.
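A data field can be taken apart into its indicators and subfields with a few lines of Python. This sketch assumes the usual binary MARC convention that the first two characters of a data field are the indicators and that each subfield starts with the delimiter character 0x1F followed by its code. The function names and the sample field are made up.

```python
SUBFIELD_DELIMITER = "\x1f"


def is_control_field(tag: str) -> bool:
    """Control fields are the ones whose tag starts with 00."""
    return tag.startswith("00")


def split_data_field(data: str):
    """Return (indicators, [(code, value), ...]) for one data field."""
    indicators = data[:2]
    subfields = []
    # Everything before the first delimiter is the indicators, so skip it.
    for chunk in data[2:].split(SUBFIELD_DELIMITER)[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:]))
    return indicators, subfields


# Made-up field: first indicator "1", second indicator blank.
indicators, subfields = split_data_field("1 \x1faRand, Ted,\x1feill.")
```

Control fields have neither indicators nor subfields, which is why `is_control_field` should be checked before calling `split_data_field`.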

markup Markup is the information contained in a document that is not its contents. Markup mainly comes with two types of information:
– information related to the structure
– information related to the appearance
In good documents, structure and appearance are related.

if there were no markup If markup did not exist, it would be quite trivial to represent every document with a relational database structure. You simply have a table with character positions (first character position to last character position) and the character found there. But this would hardly correspond to our idea of a document.
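The table of character positions can be sketched with an in-memory SQLite database. The table and column names are made up for the illustration.

```python
import sqlite3

text = "hello world"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE characters ("
    "position INTEGER PRIMARY KEY, character TEXT NOT NULL)"
)
# One row per character: (position, character found there).
conn.executemany(
    "INSERT INTO characters VALUES (?, ?)", enumerate(text, start=1)
)

# Reading the rows back in position order restores the text.
rows = conn.execute(
    "SELECT character FROM characters ORDER BY position"
).fetchall()
restored = "".join(character for (character,) in rows)
row_count = len(rows)
```

The round trip works, but nothing in the table says where a title, a paragraph or a chapter begins. That missing information is what markup carries.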

structure The structure of a document is a bit difficult to define, but easy to understand by example. In a printed document, the sequence of pages defines one structure. But if the book has chapters and sections, they too define structures, and so do index pages, title pages etc. A database table representation of this becomes messy.

appearance Appearance is usually used to communicate the structure of a document in a way that helps a human understand the structure. For example, look at this slide. We can consider it a document. Find ways in which the appearance communicates the structure.

appearance Appearance covers things such as
– fonts used
– background and foreground colors
– positioning of structural elements
If a document has some appearance and structure, it is tough to adapt it to a relational database structure.

XML XML is a syntax to encode information as documents. XML is not really a language since it has no vocabulary. You can use any vocabulary you like.

XML nodes XML is written in the form of nodes. I will only discuss three types of nodes here:
– character data
– XML elements
– attributes to elements
Character data is just that: characters.

XML elements If you write an element, write something of the form <name>contents</name>. Here name is the name of the element and contents is the contents of the element. The contents can be character data and/or other elements.

XML tags <name> is the start tag of an element that is called name. </name> is the end tag of an element that is called name. XML tags are a syntactic feature of XML. They are not nodes.

empty elements If an element has no contents whatsoever, it can be written as <name></name> or <name/>. In the latter case it is an empty element.

element examples Thomas Krichel Mr. Thomas Krichel hello world

child elements If an element is in the contents of another element, it is called a child element. When you write an XML document, all elements must be contained in one single element. That single element is called the root element. The root element is the only element without a parent element.
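Elements, child elements and the root element can be explored with Python's standard `xml.etree.ElementTree` module. The element names below (`person`, `title`, `name`, `note`) are made up; XML lets you choose any vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: one root element with three children.
doc = "<person><title>Mr.</title><name>Thomas Krichel</name><note/></person>"

root = ET.fromstring(doc)
root_name = root.tag
child_names = [child.tag for child in root]

# Character data is the text inside an element.
name_text = root.find("name").text

# An empty element has no character data and no children.
note = root.find("note")
note_is_empty = note.text is None and len(note) == 0

# Two top-level elements are not a well-formed document.
try:
    ET.fromstring("<a/><b/>")
    two_roots_accepted = True
except ET.ParseError:
    two_roots_accepted = False
```

The parser rejects the second string because it has no single root element.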

attributes Attributes attach name=value pairs to elements. These attribute-value pairs are written in the start tag.

attribute examples Thomas Krichel Krichel, Thomas

more on attributes Attribute names and values are strings. Attribute values are surrounded by single or double quotes. Attribute names are separated from values by the = sign.
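A short sketch of how a parser sees attributes, again with `xml.etree.ElementTree`. The element name and the attribute names `order` and `lang` are made up for the illustration. One value uses double quotes and the other single quotes, to show that both are accepted.

```python
import xml.etree.ElementTree as ET

# Attributes are written in the start tag.
element = ET.fromstring(
    "<name order=\"first-last\" lang='en'>Thomas Krichel</name>"
)

# The parser exposes them as a mapping from name to value.
attributes = dict(element.attrib)
order = element.get("order")

# Asking for an attribute that is not there returns None.
missing = element.get("script")
```

The quotes are part of the syntax only; the values the parser returns do not include them.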

XML application examples HTML is the language used to encode a specific type of document known as a web page. It has a vocabulary of element names and attribute names. HTML is written in XML syntax or a syntax that is close to it.

example HTML element The <a> element creates an anchor. This is a part of the document that leads to another document. Where it leads to is given by an attribute called href. Example: <a href="…">Thomas Krichel</a>

example HTML element The HTML <img> element requests an image to be included in the web page. Note that this element is empty.
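The two HTML elements above can be picked out of a page with Python's standard `html.parser` module. This is a sketch only: the class name, the sample page and its URLs are made up, and `src` is the HTML attribute that names the image file.

```python
from html.parser import HTMLParser


class LinkAndImageCollector(HTMLParser):
    """Collect the href of every <a> and the src of every <img>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])


# Made-up page fragment with placeholder URLs.
page = (
    '<p><a href="http://example.org/">Example</a> '
    '<img src="logo.png" alt="logo"/></p>'
)

collector = LinkAndImageCollector()
collector.feed(page)
```

The anchor has contents (the link text), while the image element carries all of its information in attributes.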

MARC XML In order to increase the interoperability of MARC, the Library of Congress defined a mapping of the MARC format into the XML syntax. Not everybody thinks it is a good idea: e/2009/200909/1450.html A shamelessly copied example is at 512/external_doc/sandburg.xml

start of the example 01142cam a DLC s1993 caua j eng

end of the example Visual perception. Rand, Ted, ill.

comments on example In an XML document, there must be one element that contains all other elements. In this case this is the <collection> element. The <collection> can contain many <record> elements. In the example, there is just one. Find the features of MARC as set out in the description of MARC.
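The MARC features can be found programmatically as well. The sketch below reads a cut-down MARCXML document with `xml.etree.ElementTree`. The document is not the full example from the slides. It reuses the two text values shown on the "end of the example" slide, but the tags, indicators and subfield codes given to them here are assumptions, as is the namespace URI, which is the one I understand the MARCXML schema to use.

```python
import xml.etree.ElementTree as ET

# Assumed MARCXML namespace; "marc" is just a local prefix for lookups.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Cut-down, partly hypothetical document: tags, indicators and codes assumed.
doc = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="650" ind1=" " ind2="0">
      <subfield code="a">Visual perception.</subfield>
    </datafield>
    <datafield tag="700" ind1="1" ind2=" ">
      <subfield code="a">Rand, Ted,</subfield>
      <subfield code="e">ill.</subfield>
    </datafield>
  </record>
</collection>"""

root = ET.fromstring(doc)
records = root.findall("marc:record", NS)

# Field names and indicators are attributes; subfields are child elements.
fields = []
for field in records[0].findall("marc:datafield", NS):
    subfields = [
        (sub.get("code"), sub.text)
        for sub in field.findall("marc:subfield", NS)
    ]
    fields.append((field.get("tag"), field.get("ind1"), field.get("ind2"), subfields))
```

Each MARC feature maps onto an XML node type: the field name and indicators become attributes, the subfields become child elements, and the values become character data.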