

Introduction XML stands for eXtensible Markup Language. It was designed to transport and store data, not to display it. XML is similar to HTML, but its tags are not predefined; they are defined by users. XML is a W3C Recommendation. The main idea of this work is to compress well-formed XML files that an application generates from database queries.

XML file structure [diagram]: XML file head; main XML element; one XML element per query row; within each row, one XML element per query column, holding the data.

Algorithm The algorithm takes advantage of the well-defined structure of the XML files, and of how frequently the same column values repeat across rows. This is the key idea of the algorithm. It uses compression strategies similar to a static dictionary, in which XML tags and "DataKeys" are replaced by unused ASCII characters.

Description Compression Algorithm The file is processed in two (2) phases. Phase One identifies the XML tags, the available ASCII characters, and the DataKeys. DataKeys are sorted by the following score: Length(DataKey) * frequency - (Length(DataKey) + frequency). Any DataKey beyond the number of available ASCII characters is discarded. Examples:
– Key len = 20, frequency = 10: 30 instead of 200, a saving of 170
– Key len = 15, frequency = 10: 25 instead of 150, a saving of 125
– Key len = 30, frequency = 5: 35 instead of 150, a saving of 115
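The Phase One ranking can be sketched as follows. This is only an illustration of the score quoted above, not the author's implementation: the function names are hypothetical, and it assumes that a higher score ranks first and that the limit is simply the count of free ASCII characters.

```python
from collections import Counter


def datakey_score(key_length: int, frequency: int) -> int:
    """Score from the slide: Length * frequency - (Length + frequency)."""
    return key_length * frequency - (key_length + frequency)


def select_datakeys(values, available_chars):
    """Rank candidate DataKeys by score (assumed: highest first) and keep
    only as many as there are free ASCII characters (hypothetical helper)."""
    counts = Counter(values)
    ranked = sorted(
        counts,
        key=lambda value: datakey_score(len(value), counts[value]),
        reverse=True,
    )
    return ranked[:available_chars]


if __name__ == "__main__":
    # The three worked examples from the slide.
    for length, freq in [(20, 10), (15, 10), (30, 5)]:
        print(length, freq, datakey_score(length, freq))
```

Candidates that fall past the cutoff stay in the output as plain column data, which matches the rule that any DataKey over availability is discarded.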

Description Compression Algorithm Phase II reads the XML file again in order to create a new file. The new file starts with a header, built from the information gathered in Phase I (its layout is shown later), which is needed to reconstruct the XML file. In the rest of the file, Tags and DataKeys are replaced by the available ASCII characters.

Description Compression Algorithm Rules to replace Tags/DataKeys:
– Main tag: skipped.
– Row tag: an ASCII char is assigned.
– Column tag: an ASCII char is assigned.
– If the column data is a DataKey:
   – if an ASCII char is assigned to it, write just the assigned ASCII char;
   – else write the assigned column char + column data.
– Else write the assigned column char + column data.
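A minimal sketch of these replacement rules for a single row is given below. The function names are hypothetical, the row character "@" and the COUNTRY column character "+" are placeholders, and the remaining character assignments are only read off the example output in these slides, so treat all of them as assumptions.

```python
def encode_column(column, data, column_chars, datakey_chars):
    """Apply the column rule: a DataKey with an assigned char is written as
    that char alone; anything else is written as column char + data."""
    assigned = datakey_chars.get((column, data))
    if assigned is not None:
        return assigned
    return column_chars[column] + data


def encode_row(row_char, columns, column_chars, datakey_chars):
    """Encode one row: the row char followed by each encoded column.
    The main tag is skipped, so it never appears here."""
    parts = [row_char]
    for column, data in columns:
        parts.append(encode_column(column, data, column_chars, datakey_chars))
    return "".join(parts)


# Assumed assignments, loosely mirroring the example output in the slides.
COLUMN_CHARS = {"TITLE": "&", "ARTIST": "!", "COUNTRY": "+",
                "COMPANY": "*", "PRICE": "$", "YEAR": "#"}
DATAKEY_CHARS = {("COUNTRY", "USA"): "%", ("COUNTRY", "UK"): "~",
                 ("COUNTRY", "Colombia"): "^"}

if __name__ == "__main__":
    row = [("TITLE", "Thriller"), ("ARTIST", "Michael Jackson"),
           ("COUNTRY", "USA"), ("COMPANY", "Columbia"), ("PRICE", "11.90")]
    print(encode_row("@", row, COLUMN_CHARS, DATAKEY_CHARS))
```

Keying the DataKey table by (column, value) keeps a value such as "USA" from being replaced when it happens to appear in a column that was not selected for DataKeys.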

Description Decompression Algorithm Read the file header:
– The first four (4) characters mean:
   1. Number of BitWise characters (the used ASCII chars).
   2. First used ASCII char.
   3. Number of Element tags.
   4. Number of DataKey sets.
– According to char 4, read the Col/Num pairs.
– According to char 1, read the BitWise characters.
– According to char 3, read the Element strings.
– According to the total Num from the pairs, read the DataKeys.
– Read the rest of the file, replacing the assigned ASCII chars.
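The last step, replacing assigned characters while reading the rest of the file, might look like the sketch below. It assumes the header has already been parsed into two lookup tables; the function name, table shapes, and character assignments are hypothetical, and row handling is left out for brevity.

```python
def decode_stream(stream, column_chars, datakey_chars):
    """Expand a compressed body into (column, data) pairs.

    column_chars:  assigned char -> column name
    datakey_chars: assigned char -> (column name, DataKey value)
    Any other character is treated as data of the current column.
    """
    fields = []
    for ch in stream:
        if ch in datakey_chars:
            # A DataKey char stands for a whole column and its value.
            fields.append(list(datakey_chars[ch]))
        elif ch in column_chars:
            # A column char opens a new column; its data follows.
            fields.append([column_chars[ch], ""])
        elif fields:
            fields[-1][1] += ch
    return [tuple(field) for field in fields]


# Assumed tables, as they might be rebuilt from a parsed header.
COLUMN_BY_CHAR = {"&": "TITLE", "!": "ARTIST", "*": "COMPANY",
                  "$": "PRICE", "#": "YEAR"}
DATAKEY_BY_CHAR = {"%": ("COUNTRY", "USA"), "~": ("COUNTRY", "UK"),
                   "^": ("COUNTRY", "Colombia")}

if __name__ == "__main__":
    body = "&Thriller!Michael Jackson%*Columbia$11.90"
    for column, data in decode_stream(body, COLUMN_BY_CHAR, DATAKEY_BY_CHAR):
        print(column, data)
```

This only works because the assigned characters are ones that never occur in the data, which is what Phase One establishes when it collects the available ASCII characters.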

Application Syntax
xmlzip [-c filename.xml [-k column_1 … column_n]] | [-d filename.xzp]
Where:
-c: compress
-k: column numbers to be used as DataKeys
-d: decompress
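For illustration only, the same command line could be parsed with Python's standard argparse module as shown below. The slides do not say how xmlzip is implemented, so this is a hypothetical front end: only the three flags come from the syntax above, and the choice to reject -k without -c is an assumption.

```python
import argparse


def build_parser():
    """Build a parser for the xmlzip syntax (hypothetical front end)."""
    parser = argparse.ArgumentParser(prog="xmlzip")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-c", metavar="filename.xml", help="compress")
    mode.add_argument("-d", metavar="filename.xzp", help="decompress")
    parser.add_argument("-k", metavar="column", type=int, nargs="+",
                        default=[],
                        help="column numbers to be used as DataKeys")
    return parser


def parse_args(argv):
    """Parse argv and reject -k when not compressing (assumed rule)."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.k and args.c is None:
        parser.error("-k is only valid together with -c")
    return args


if __name__ == "__main__":
    print(parse_args(["-c", "cd.xml", "-k", "3"]))
```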

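HEADER Converted File The header of the converted file holds the following fields, in order:
– NUMBITWISE
– STARTASCII
– NUMELEMENT
– DATAKEYNUM
– COLUMNNUMB, SUBDATAKEY
– BITWISECHR … BITWISECHR
– ELEMENTSTR … ELEMENTSTR
– DATAKEYSTR … DATAKEYSTR
– NULLCHARAC
In the example: CATALOG CD TITLE ARTIST COUNTRY COMPANY PRICE YEAR USA UK Colombia 0 …
The header length is proportional to the number of characters used in the XML file, the XML file's elements, and the DataKeys found in the XML file:
$$H = 4 + 2 \cdot \mathrm{DATAKEYNUM} + \mathrm{NUMBITWISE} + \sum_{i=1}^{\mathrm{NUMELEMENT}} \left[\mathrm{length}(\mathrm{ELEMENTSTR}_i) + 1\right] + \sum_{j=1}^{\sum \mathrm{SUBDATAKEY}} \left[\mathrm{length}(\mathrm{DATAKEYSTR}_j) + 1\right] + 1$$
In this case, the file HEADER is: H = … = 87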
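The header-length formula can be checked with a few lines of code. This is a sketch under assumptions: the function name is hypothetical, DataKeys are assumed to be grouped per column so that DATAKEYNUM is the number of DataKey columns, and the NUMBITWISE value used in the demo is a placeholder, not a figure from the slides.

```python
def header_length(num_bitwise, element_names, datakeys_by_column):
    """Evaluate H = 4 + 2*DATAKEYNUM + NUMBITWISE
                    + sum(len(ELEMENTSTR_i) + 1)
                    + sum(len(DATAKEYSTR_j) + 1) + 1.

    datakeys_by_column maps a column number to its list of DataKeys,
    so DATAKEYNUM is taken to be the number of such columns (assumption).
    """
    datakey_num = len(datakeys_by_column)
    element_part = sum(len(name) + 1 for name in element_names)
    datakey_part = sum(len(key) + 1
                       for keys in datakeys_by_column.values()
                       for key in keys)
    return 4 + 2 * datakey_num + num_bitwise + element_part + datakey_part + 1


# Element and DataKey strings as listed in the slide's example header.
ELEMENTS = ["CATALOG", "CD", "TITLE", "ARTIST",
            "COUNTRY", "COMPANY", "PRICE", "YEAR"]
DATAKEYS = {3: ["USA", "UK", "Colombia"]}  # column number 3 is a placeholder

if __name__ == "__main__":
    # num_bitwise=0 is a placeholder; the real value is not in the transcript.
    print(header_length(0, ELEMENTS, DATAKEYS))
```

Each string contributes its length plus one, which is consistent with every ELEMENTSTR and DATAKEYSTR being stored with a one-character terminator.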

Example. Source data (XML tags not shown):
– Empire Burlesque, Bob Dylan, USA, Columbia
– Hide your heart, Bonnie Tyler, UK, CBS Records
– Thriller, Michael Jackson, USA, Columbia
– Love Songs, Bee Gee, UK, Records
– Oral Fixation, Shaquira, Colombia, Epic
Converted output:
&Empire Burlesque !Bob Dylan % *Columbia $10.90
&Hide your heart !Bonnie Tyler ~ *CBS Records $9.90
&Thriller !Michael Jackson % *Columbia $11.90
&Love Songs !Bee Gee ~ *Records $12.00
&Oral Fixation !Shaquira ^ *Epic $18.70 #2006

Next The next step is to make the algorithm generic, in particular the feature that takes advantage of column frequency. It could be driven by tag name instead of column number. I didn't implement this for lack of time, but it would avoid any conflict caused by column order. XML attribute recognition also needs to be implemented. It is almost done, but I stopped because of time constraints. It would be useful to let the user specify, through parameters, which attributes are to be taken into account. One reason is that element tags and attribute names can share the same name even though they hold different kinds of data. Last but not least, the implementation of a modified PPM algorithm should be completed. The first task would be to add to the HEADER those DataKeys that exceed the available ASCII chars and satisfy the condition Length(DataKey) > largest context and frequency > 1, so that they can be added to a "temporary" count array in which the size of the DataKey does not matter.