Advance Database S Week-5 Dr.Kwanchai Eurviriyanukul
Contents Week-5
XML Basics (Week 5: 7 Dec)
– DOM
– XML DTD
– XML Schema
XML Query
– XPath (Week 6: 14 Dec)
– XQuery (Week 7: 21 Dec)
– XSL (Week 8: 24 Dec)
WebService (Week 9: 4 Jan)
– TH-E-GIF
DataIntegration (Week 10: 11 Jan)
– Schema matching and mapping
Parallel Database (Week 11: 18 Jan)
GIS (Week 12)
XML Basics: Content of Week 5
– DOM (Review)
– XML DTD
– XML Schema
– Lab: DTD and Schema construction
Week5-Lab: Construct and validate your XML document using a DTD and PHP.
XML Document example
From Stanford: Well-Formed and Valid XML
– Well-Formed XML allows you to invent your own tags.
– Valid XML conforms to a certain DTD.
What is “Well-Formed XML”?
What are XML syntax rules? 1.? 2.? 3.? 4.? 5.?
From Stanford: Well-Formed XML
– Start the document with a declaration, surrounded by <?xml … ?>.
– Normal declaration is: <?xml version = "1.0" standalone = "yes" ?>
  – “standalone” = “no DTD provided.”
– Balance of document is a root tag surrounding nested tags.
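The well-formedness rules above can be tried out with a small check. This is a minimal sketch in Python (the lab itself uses PHP); the sample documents are hypothetical and the function name `is_well_formed` is an illustration only. It relies on the fact that a non-validating parser accepts exactly the well-formed documents.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, i.e. it is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, not taken from the slides.
good = '<?xml version="1.0" standalone="yes"?><BARS><BAR>Joe</BAR></BARS>'
bad_nesting = "<BARS><BAR>Joe</BARS></BAR>"  # tags closed in the wrong order
two_roots = "<BAR>Joe</BAR><BAR>Sue</BAR>"   # no single root tag

print(is_well_formed(good), is_well_formed(bad_nesting), is_well_formed(two_roots))
```

Note that this only checks well-formedness; it says nothing about validity against a DTD.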
XML DTD
DTD Declaration
Guess: internal or external?
DTD Declaration External Declaration note.dtd
Internal DTD Declaration
From Stanford: Example (a)
The DTD: <!DOCTYPE BARS [ ]>
The document: Joe’s Bar Bud 2.50 Miller 3.00 …
1.? 2.?
From Wiki: Example 1.? 2.? 3.? 4.?
DTD Declaration Exercise External Declaration note.dtd
From Stanford: Example (b)
Assume the BARS DTD is in file bar.dtd.
The document: Joe’s Bar Bud 2.50 Miller 3.00 …
Get the DTD from the file bar.dtd (External Declaration)
1.?
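The difference between the internal declaration of example (a) and the external declaration of example (b) can be seen by asking a parser what it found in the DOCTYPE. The sketch below uses Python's expat binding. The element names BAR, NAME, BEER and PRICE and the DTD body are assumptions, since the slide text does not preserve them, and `describe_doctype` is a hypothetical helper. expat does not validate and does not fetch bar.dtd; it only reports the declaration.

```python
from xml.parsers import expat


def describe_doctype(xml_text):
    """Report the DOCTYPE root name, the external file (if any), and
    whether the DOCTYPE carries an internal subset."""
    info = {"root": None, "system_id": None, "internal_subset": False}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["root"] = name
        info["system_id"] = system_id
        info["internal_subset"] = bool(has_internal_subset)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return info


# Assumed reconstruction of example (a): DTD inside the document.
INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*)>
  <!ELEMENT BAR (NAME, BEER+)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT BEER (NAME, PRICE)>
  <!ELEMENT PRICE (#PCDATA)>
]>
<BARS><BAR><NAME>Joe's Bar</NAME>
  <BEER><NAME>Bud</NAME><PRICE>2.50</PRICE></BEER>
  <BEER><NAME>Miller</NAME><PRICE>3.00</PRICE></BEER>
</BAR></BARS>"""

# Assumed reconstruction of example (b): DTD kept in bar.dtd.
EXTERNAL = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE BARS SYSTEM "bar.dtd">
<BARS><BAR><NAME>Joe's Bar</NAME></BAR></BARS>"""

print(describe_doctype(INTERNAL))
print(describe_doctype(EXTERNAL))
```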
From Wiki: Example (External Declaration) 1.? 2.? 3.?
Why use a DTD?
DTD - XML Building Blocks
DTD - Elements
DTD - Attributes
DTD - Entities
DTD - PCDATA
DTD - CDATA
DTD - Elements Webpage for more details
From Stanford: Element Descriptions
Subtags must appear in order shown.
A tag may be followed by a symbol to indicate its multiplicity.
– * = zero or more.
– + = one or more.
– ? = zero or one.
Symbol | can connect alternative sequences of tags.
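The multiplicity symbols above behave like regular-expression operators, so an element-only content model can be checked with a regex over the sequence of child tags. This is a rough sketch under that assumption, not how a real validating parser works: it ignores #PCDATA, EMPTY and ANY, and the helper names and the sample model are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# A tag name inside a content model, e.g. NAME or BEER.
NAME = re.compile(r"[A-Za-z_][\w.\-]*")


def model_to_regex(model):
    """Turn a content model such as '(NAME, BEER+)' into a regex over
    a string of child tags written as 'NAME;BEER;BEER;'."""
    # Wrap each tag name in a group; *, +, ?, | and () keep their meaning.
    tagged = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + ";)", model)
    # The comma means "followed by", which is plain concatenation in a regex.
    return re.compile(re.sub(r"[\s,]", "", tagged))


def children_match(element, model):
    """Check the direct children of element against the content model."""
    sequence = "".join(child.tag + ";" for child in element)
    return model_to_regex(model).fullmatch(sequence) is not None


bar = ET.fromstring("<BAR><NAME>Joe's Bar</NAME><BEER/><BEER/></BAR>")
no_beer = ET.fromstring("<BAR><NAME>Joe's Bar</NAME></BAR>")
wrong_order = ET.fromstring("<BAR><BEER/><NAME>Joe's Bar</NAME></BAR>")

print(children_match(bar, "(NAME, BEER+)"))          # order and multiplicity hold
print(children_match(no_beer, "(NAME, BEER+)"))      # + needs at least one BEER
print(children_match(wrong_order, "(NAME, BEER+)"))  # subtags out of order
```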
DTD – Elements Quiz (tutorial.dtd)
Is each XML document valid?
1. => YES
2. => NO
3. => YES
4. => NO
5. => NO
From Stanford: DTD Structure
<!DOCTYPE <root tag> [
  <!ELEMENT <name> ( <components> )>
  ... more elements ...
]>
Week-1-Homework Marking criteria
1. You have successfully created your database. (2.5 marks)
2. You have successfully populated your data into your big table. (2.5 marks)
3. You have successfully populated your data into your normalized tables. (2.5 marks)
4. You have demonstrated the enforcement of foreign key constraints for your database. (2.5 marks)
Homework-Week-2
1. Data Population (Normalization from last week) 1
2. HTML processing 4
3. Venn Diagram Creation 4
Lab for Week-3
1. Marking Week-2 Lab
2. Hand-draw an XML tree structure for the example given in class. (2.5)
3. Program-draw an XML tree structure for HTML data (from line ) 1. k-2/thailand.htm (2.5)
4. Change the XPath expression of your previous code to … and observe what is different. (5)
Contents Week-4
Create table using MySQL Workbench
– Collation => Sorting => utf8_general_ci / utf8_unicode_ci
– Storage Engine => MyISAM, InnoDB
Populate Data: Select Data into table
Join Optimization
XML Basics
– DOM
Lab for Week-4
1. Write SQL statements to populate data into the following schema.
2. Create indexes to speed up the join of these tables.
3. Create a program to retrieve data from the web and populate it into your tables.
– Province coords
– Amphur coords
Lab for Week-4-Cont.
Week-4 Homework Explain
ID Structure
province - district_name - sub_district_name - moo
Province - District - Sub-district - Moo (village number) => village ID
เชียงใหม่ - จอมทอง - แม่สอย - Moo 13 => village ห้วยพัฒนา
เชียงใหม่ - แม่แจ่ม - ช่างเคิ้ง - Moo 2 => village ต่อเรือ
How about an ID like ‘000001’? In INT format it becomes ‘1’, so we lose the leading ‘00000’. CHAR(6) is therefore the solution.
From: refman-5.5-en.html-chapter/data-types.html#numeric-types
MySQL supports the SQL standard integer types INTEGER (or INT) and SMALLINT. As an extension to the standard, MySQL also supports the integer types TINYINT, MEDIUMINT, and BIGINT.
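The loss of leading zeros is easy to reproduce. The sketch below uses Python's built-in SQLite as a stand-in for MySQL (an assumption; the table and column names are hypothetical). The same ID string is stored once in an INT column and once in a CHAR(6) column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id_int has integer affinity, id_char has text affinity.
conn.execute("CREATE TABLE id_demo (id_int INT, id_char CHAR(6))")
conn.execute("INSERT INTO id_demo VALUES ('000001', '000001')")

id_int, id_char = conn.execute("SELECT id_int, id_char FROM id_demo").fetchone()

# The INT column comes back as a number, the CHAR column keeps every digit.
print(repr(id_int), repr(id_char))
```

SQLite is looser about column types than MySQL, so treat this only as an illustration of why a character type is chosen for IDs with leading zeros.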
Let’s analyze the possible ID for regions SQL Here to get => Villagedata
How about?
select DISTINCT left(village_id,1), DISTINCT region_name from villagedata;
=> Error: DISTINCT may appear only once, directly after SELECT, and it applies to the whole row.
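The failing query and a working alternative can be compared side by side. This is a sketch on SQLite rather than MySQL, so `substr(village_id, 1, 1)` replaces `left(village_id,1)`; the sample rows and the eight-digit ID format are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE villagedata (village_id CHAR(8), region_name VARCHAR(45))")
conn.executemany(
    "INSERT INTO villagedata VALUES (?, ?)",
    [("10010101", "North"), ("10010102", "North"), ("20010101", "South")],
)

# A second DISTINCT inside the select list is a syntax error.
try:
    conn.execute(
        "SELECT DISTINCT substr(village_id, 1, 1), DISTINCT region_name FROM villagedata"
    )
    second_distinct_ok = True
except sqlite3.OperationalError:
    second_distinct_ok = False

# One DISTINCT after SELECT removes duplicate (idregion, nameregion) rows.
rows = conn.execute(
    "SELECT DISTINCT substr(village_id, 1, 1) AS idregion, region_name AS nameregion "
    "FROM villagedata ORDER BY idregion"
).fetchall()

print(second_distinct_ok, rows)
```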
Now we need to construct region table Select name, collation, engine
What is collation???
Collation
– utf8: a UTF-8 encoding of the Unicode character set using one to three bytes per character.
– To collate => to collect and arrange in correct order (e.g. the sheets of a document).
Character Sets and Collations in General
Let’s have a look What are these?
Collation What collation should we use?
Collation => character sort order
We use utf8_unicode_ci because it compares foreign characters more correctly, e.g. ß is treated as ‘ss’ instead of ‘s’. Note that it is slower than utf8_general_ci for comparisons.
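The ß example can be imitated with two comparison keys. This is only an illustration of the idea, not MySQL's actual collation code: `general_like_key` is a hypothetical one-character-to-one-character folding, while `unicode_like_key` uses Python's `str.casefold`, which expands ß to ss.

```python
def general_like_key(text):
    """Hypothetical one-to-one folding: ß is treated like a single 's'."""
    return text.lower().replace("ß", "s")


def unicode_like_key(text):
    """Folding with expansions: casefold turns ß into 'ss'."""
    return text.casefold()


# Under the expanding comparison, Straße equals STRASSE.
print(unicode_like_key("Straße") == unicode_like_key("STRASSE"))
# Under the one-to-one comparison it does not; it equals 'strase' instead.
print(general_like_key("Straße") == general_like_key("STRASSE"))
print(general_like_key("Straße") == general_like_key("strase"))
```

The one-to-one version does less work per character, which is in line with the note that the more correct comparison is the slower one.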
Database Engine Selection
We will use InnoDB because we want transaction and foreign key support.
EB (exabyte) = 10^18 bytes
engines.html
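What foreign key support means in practice: the engine rejects a child row that points at a missing parent. The sketch below shows this on SQLite as a stand-in for InnoDB (an assumption). The province table, its columns and the sample rows are hypothetical; only the region columns follow the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute(
    "CREATE TABLE region (idregion CHAR(1) PRIMARY KEY, nameregion VARCHAR(45) NOT NULL)"
)
conn.execute(
    """CREATE TABLE province (
           prov_id CHAR(2) PRIMARY KEY,
           prov_name VARCHAR(45),
           region_id CHAR(1) NOT NULL REFERENCES region (idregion))"""
)

conn.execute("INSERT INTO region VALUES ('1', 'North')")
conn.execute("INSERT INTO province VALUES ('50', 'Chiang Mai', '1')")  # parent exists

# Region '9' does not exist, so this insert should be refused.
try:
    conn.execute("INSERT INTO province VALUES ('99', 'Nowhere', '9')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```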
Create table region
Synchronize with DB
CREATE TABLE IF NOT EXISTS `sml_test`.`region` (
  `idregion` CHAR(1) NOT NULL,
  `nameregion` VARCHAR(45) NULL DEFAULT NULL,
  PRIMARY KEY (`idregion`) )
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_unicode_ci;
Synchronize with DB
CREATE TABLE IF NOT EXISTS `sml_test`.`region` (
  `idregion` CHAR(1) NOT NULL,
  `nameregion` VARCHAR(45) NOT NULL,
  PRIMARY KEY (`idregion`) )
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_unicode_ci;
DROP TABLE IF EXISTS `sml_test`.`tb_region`;
DROP TABLE IF EXISTS `sml_test`.`tb_province`;
Contents Week-3
Constraints
Create table using MySQL Workbench
– Collation => Sorting => UTF-8_general/unicode
– Storage Engine => MyISAM, InnoDB
Select Data into table
XML Basics
– XML Quizzes
Let’s analyze the possible ID for regions
What is wrong?
Drop the primary key of region. Now please write down an SQL statement to insert into the region table.
Insert into region
insert into region
select distinct left(village_id,1) as idregion, region_name as nameregion
from villagedata;
What should be the primary key?
insert into region
select distinct left(village_id,1) as idregion, region_name as nameregion
from villagedata;
A single column is duplicated. How about two columns for the PK?
Synchronize database
Alter table sml_test.region
Add primary key(idregion, nameregion);
Data still exists
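The reasoning behind the two-column key can be replayed on a toy table. This is a sketch on SQLite, which cannot add a primary key with ALTER TABLE, so the two key choices are shown as two separate tables (region_single and region_pair, both hypothetical names). The sample rows are invented so that one idregion appears with two names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: idregion '1' occurs with two different names.
rows = [("1", "North"), ("1", "Upper North"), ("2", "South")]

# Single-column key: the second '1' violates the primary key.
conn.execute(
    "CREATE TABLE region_single (idregion CHAR(1) PRIMARY KEY, nameregion VARCHAR(45))"
)
try:
    conn.executemany("INSERT INTO region_single VALUES (?, ?)", rows)
    single_ok = True
except sqlite3.IntegrityError:
    single_ok = False

# Two-column key: each (idregion, nameregion) pair is unique, so all rows fit.
conn.execute(
    """CREATE TABLE region_pair (
           idregion CHAR(1) NOT NULL,
           nameregion VARCHAR(45) NOT NULL,
           PRIMARY KEY (idregion, nameregion))"""
)
conn.executemany("INSERT INTO region_pair VALUES (?, ?)", rows)
pair_count = conn.execute("SELECT COUNT(*) FROM region_pair").fetchone()[0]

print(single_ok, pair_count)
```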
Join Optimization
[Figure; legible values: 80,000 and 8,…]
How about joining them back together?
Response Time-1
-- Second query
select region_name, prov_name, amp_name
from region r, province p, amphur a
where r.region_id = p.region_id
and p.prov_id = a.prov_id;

Response Time-2
select region_name, prov_name, amp_name, tam_name
from region r, province p, amphur a, tambon t
where r.region_id = p.region_id
and p.prov_id = a.prov_id
and a.amp_id = t.amp_id;

Response Time-3
select region_name, prov_name, amp_name, tam_name, vill_name
from region r, province p, amphur a, tambon t, village v
where r.region_id = p.region_id
and p.prov_id = a.prov_id
and a.amp_id = t.amp_id
and t.tam_id = v.tam_id;
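To see what the first of these joins returns, it can be run on a few rows. This is a sketch on SQLite in place of MySQL; the column types, the ID values and the sample rows are hypothetical, while the query is the three-table join from Response Time-1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE region   (region_id CHAR(1), region_name VARCHAR(45));
    CREATE TABLE province (prov_id CHAR(2), prov_name VARCHAR(45), region_id CHAR(1));
    CREATE TABLE amphur   (amp_id CHAR(4), amp_name VARCHAR(45), prov_id CHAR(2));
    INSERT INTO region   VALUES ('1', 'North');
    INSERT INTO province VALUES ('50', 'Chiang Mai', '1');
    INSERT INTO amphur   VALUES ('5001', 'Chom Thong', '50');
    INSERT INTO amphur   VALUES ('5002', 'Mae Chaem', '50');
    """
)

# Each amphur row finds its province, and each province its region.
rows = conn.execute(
    """
    SELECT region_name, prov_name, amp_name
    FROM region r, province p, amphur a
    WHERE r.region_id = p.region_id
      AND p.prov_id = a.prov_id
    ORDER BY amp_name
    """
).fetchall()

for row in rows:
    print(row)
```

The result has one row per amphur, which is why the joined output grows toward the size of the largest table as more levels are added.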
phpMyAdmin
How to speed up a join? Create Index
– Which table should be considered?
– Which column should be considered?
Create Index Village Table
Response Time-3 (the same five-table query, rerun after creating the index)
What if we remove the index on village.tam_id?
select region_name, prov_name, amp_name, tam_name, vill_name
from region r, province p, amphur a, tambon t, village v
where r.region_id = p.region_id
and p.prov_id = a.prov_id
and a.amp_id = t.amp_id
and t.tam_id = v.tam_id;
[Figure: response times with the index removed vs. with the index]
Where else can we add indexes? Can we create an index on every column?
DOM
XML => Book
DOM Nodes
Node Tree
Node Type Property
Node List
DOM
DOMXPath::query: returns a DOMNodeList containing all nodes matching the given XPath expression.
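The idea of "give an XPath expression, get back the matching nodes" can be tried without PHP. This sketch uses Python's ElementTree, which supports only a limited subset of XPath, as a stand-in for DOMXPath::query (an assumption). The bookstore document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical bookstore document, not taken from the slides.
BOOKS = """<bookstore>
  <book category="web"><title>Learning XML</title><price>39.95</price></book>
  <book category="cooking"><title>Everyday Italian</title><price>30.00</price></book>
</bookstore>"""

root = ET.fromstring(BOOKS)

# All title elements anywhere below the root.
all_titles = [node.text for node in root.findall(".//title")]

# Only titles of books whose category attribute equals 'web'.
web_titles = [node.text for node in root.findall("./book[@category='web']/title")]

print(all_titles)
print(web_titles)
```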
DOM $my_xpath_query = "/";
DOM $my_xpath_query =
DOM $my_xpath_query = != 0) and != '#')]";
Week-4-Home-Work
End of Week-4 Lecture