1 4 Approaches to Structuring Lists February 22, 2009.

Presentation transcript:

1 4 Approaches to Structuring Lists February 22, 2009

2 Lists are everywhere A list of countries A list of religions A list of weights A list of students A list of days of the week A list of planets

3 The purpose of this document is to answer these questions: What are the different approaches to structuring lists? What are the pros and cons of each approach? Is there a way to structure lists to maximize their utility and minimize their overhead?

4 Lists should be usable for multiple purposes

5 Example We will use a country list to illustrate the four approaches.

6 Some ways we might use a country list Use it as values in an XForms pick list Merge it with other data to create a document that contains, for each country, sales figures (or death rates, births, political leadership, religions, etc) Use it to validate an element's content country list _______ validate

7 Approach #1 Express lists using the XML Schema vocabulary

8 <xs:schema xmlns:xs=" targetNamespace=" xmlns=" elementFormDefault="qualified">...

9 Approach #2 Express lists using the RELAX NG vocabulary

10 <grammar xmlns=" ns=" Afghanistan Albania Algeria...

11 Approach #3 Express lists using domain-specific vocabularies. The markup comes from terminology used by Subject Matter Experts (SMEs)

12 Afghanistan Albania Algeria...

13 Approach #4 Express lists using a generic list vocabulary

14 Afghanistan Albania Algeria...

15 Analysis of Each Approach

16 Approach #1 & Approach #2 Approach #1 and approach #2 make it easy to use a list for validation purposes. A schema simply imports the list schema and then the lists' values are immediately available for validating element content. Here is an XML Schema that imports the country list XML Schema and uses its simpleType as the datatype for the element: <xs:schema xmlns:xs=" targetNamespace=" xmlns:c=" elementFormDefault="qualified"> <xs:import namespace=" schemaLocation="countries.xsd" />

17 Approach #1 & Approach #2 Here is a RELAX NG schema that includes the country list RELAX NG schema and uses its define element as the datatype for the element: <grammar xmlns=" ns="

18 Approach #1 & Approach #2 If the schema doing the importing is an XML Schema then it can't use the list if it's expressed using RELAX NG. And vice versa. country list (xsd) country list (rng) Schema (xsd) Schema (rng)

19 Approach #1 & Approach #2 Although these two approaches make it efficient to use lists for validation, they are not the most efficient format for the many other ways a list may be used (rendering in a pick list, merging with other lists, searching, and so forth). This is discussed further in the analysis of approach #3 below.

20 Approach #3 Recall that approach #3 uses domain-specific terminology. This can be helpful to Subject Matter Experts (SMEs) as they maintain the lists. Validation can be accomplished using a Schematron schema. Here is a Schematron schema which validates that the content of the country-visited element matches one of the values in the country list: <sch:ns uri=" prefix="c" /> <sch:ns uri=" prefix="ex" /> The value of country-visited must be one of the countries in the countries' list.
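The Schematron rule on this slide did not survive extraction, so here is a minimal Python sketch of the same check, not the author's schema. The namespace `http://example.org/countries` and the `countries`/`country` element names are assumptions made for illustration; only `country-visited` comes from the slide's assertion message.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names; the slide's own markup was lost.
C_NS = "http://example.org/countries"

COUNTRY_LIST = f"""
<countries xmlns="{C_NS}">
  <country>Afghanistan</country>
  <country>Albania</country>
  <country>Algeria</country>
</countries>
""".strip()


def allowed_countries(list_xml):
    """Collect the text of every <country> element in the list document."""
    root = ET.fromstring(list_xml)
    return {el.text.strip() for el in root.iter(f"{{{C_NS}}}country") if el.text}


def validate(instance_xml, allowed):
    """Return one error message per country-visited value not in the list."""
    root = ET.fromstring(instance_xml)
    errors = []
    for el in root.iter("country-visited"):
        value = (el.text or "").strip()
        if value not in allowed:
            errors.append(f"'{value}' is not one of the countries in the list")
    return errors
```

The list stays in its own document and the check reads it at validation time, which is the same division of labor the Schematron approach relies on.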

21 Approach #3 With approach #3 the markup used to construct the list has semantics specific to the list. This makes possible the creation of programs that are readily understood, as they use terminology consistent with the domain. For example, the XSLT program on the following slide uses the country list to generate an HTML list of all countries.

22 <xsl:stylesheet xmlns:xsl=" xmlns:c=" version="2.0"> Countries of the World Note the template match values: they match on the list's own domain-specific elements.
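Because the stylesheet body is missing from this transcript, the following Python sketch shows the same idea in a different language: code that walks a domain-specific list using domain terms. The namespace and the `countries`/`country` element names are assumptions, not the names used in the original stylesheet.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical namespace and element names, chosen for illustration only.
C_NS = "http://example.org/countries"

COUNTRY_LIST = f"""
<countries xmlns="{C_NS}">
  <country>Afghanistan</country>
  <country>Albania</country>
  <country>Algeria</country>
</countries>
""".strip()


def countries_to_html(list_xml):
    """Render each <country> as an HTML list item under a heading."""
    root = ET.fromstring(list_xml)
    items = [
        f"<li>{escape(el.text.strip())}</li>"
        for el in root.findall(f"{{{C_NS}}}country")
        if el.text
    ]
    return "<h1>Countries of the World</h1>\n<ul>\n" + "\n".join(items) + "\n</ul>"
```

The only names the code needs are the ones a subject matter expert would already recognize.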

23 Contrast with Approach #1 and Approach #2 Conversely, with approach #1 and approach #2 the markup used to construct the list has semantics that are specific to the schema language. Consequently, programs must operate using schema terminology rather than domain terminology. For example, the XSLT program on the following slide generates an HTML list of all countries from the countries list specified by the XML Schema document.

24 <xsl:stylesheet xmlns:xsl=" xmlns:xs=" version="2.0"> Countries of the World Note the template match values. Rather than operating on the list's domain-specific elements, the XSLT program operates on XML Schema elements. This makes programming challenging and error-prone.
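To make the contrast concrete, here is a hedged Python sketch that pulls the same values out of an XML Schema document. It assumes the list is written as an `xs:simpleType` named `countriesType` that restricts `xs:string` with `xs:enumeration` facets. That is the usual way to express such a list, but the slide's own schema body is not preserved here.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Assumed shape of countries.xsd; the slide's actual schema was not preserved.
COUNTRIES_XSD = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:simpleType name="countriesType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Afghanistan"/>
      <xs:enumeration value="Albania"/>
      <xs:enumeration value="Algeria"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
""".strip()


def enumeration_values(xsd_text, type_name):
    """Return the enumerated values of the named simpleType."""
    root = ET.fromstring(xsd_text)
    # The path is written entirely in schema terms, not domain terms.
    path = (
        f"{{{XS}}}simpleType[@name='{type_name}']"
        f"/{{{XS}}}restriction/{{{XS}}}enumeration"
    )
    return [facet.get("value") for facet in root.findall(path)]
```

Nothing in the path mentions countries except the type name, so the reader has to know the schema language to follow the code.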

25 Approach #3 With approach #3 a list can be used as a building block (data component) which can be immediately dropped into other documents to create compound documents. For example, consider a list of religions, also structured using approach #3: Baha'i Buddhism Catholicism...

26 Approach #3 It's easy to construct a compound document composed of the country and religion lists: Afghanistan Albania Algeria... Baha'i Buddhism Catholicism...

27 Approach #3 Due to the modularity provided by approach #3, it is possible to perform list-specific processing on this compound document. That is, a country-list-aware application would be able to extract the country list from this compound document and process it. The same holds for a religion-list-aware application. Afghanistan Albania Algeria... Baha'i Buddhism Catholicism... country-list-aware application religion-list-aware application
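A list-aware application of this kind might look like the sketch below. The two namespaces, the `document` wrapper, and all element names are hypothetical, since the compound document's markup is missing from the transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, one per list vocabulary.
C_NS = "http://example.org/countries"
R_NS = "http://example.org/religions"

COMPOUND = f"""
<document>
  <countries xmlns="{C_NS}">
    <country>Afghanistan</country>
    <country>Albania</country>
    <country>Algeria</country>
  </countries>
  <religions xmlns="{R_NS}">
    <religion>Baha'i</religion>
    <religion>Buddhism</religion>
    <religion>Catholicism</religion>
  </religions>
</document>
""".strip()


def values_in_namespace(doc_xml, namespace):
    """Return the text of every leaf element that belongs to the namespace."""
    root = ET.fromstring(doc_xml)
    prefix = f"{{{namespace}}}"
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.startswith(prefix) and len(el) == 0 and el.text and el.text.strip()
    ]
```

A country-list-aware application would call `values_in_namespace(COMPOUND, C_NS)` and never see the religion list, and vice versa.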

28 Contrast with Approach #1 and Approach #2 With approach #1 and approach #2 the XML vocabulary used to construct the list is the same regardless of the list. Here is the compound document using lists that are defined using the XML Schema vocabulary: <xs:simpleType xmlns:xs=" name="countriesType">... <xs:simpleType xmlns:xs=" name="religionsType">... Applications can't distinguish the country list from the religion list. The namespace used by the country list cannot be distinguished from the namespace used by the religion list. Thus, the benefits namespaces provide in terms of modularity are negated. It is not easy to create country-list-aware applications or religion-list-aware applications.

29 Approach #3 Approach #3 has minimal markup overhead.

30 Approach #4 In this approach the vocabulary is not customized for a specific list as with approach #3; rather, it is a vocabulary for any list. An element in an XML instance document can be validated against the list using Schematron in the same manner described in Approach #3. With the other approaches, the vocabulary is identified via a namespace. Approach #4 doesn't use namespaces; instead, it uses data to identify the list: This data indicates that this is a list of countries

31 Identifying a Vocabulary via a Namespace versus Identifying a Vocabulary via Data

32 Identifying a Vocabulary via a Namespace One way of identifying an XML building block (data component) is by namespace. For example, this list component is identified by the namespace Sunday Monday Tuesday Wednesday Thursday Friday Saturday This list is identified by the namespace Dentist Doctor Boss Applications can be built that are namespace-aware. Different data components can be mashed together into a single document and still be extracted and processed individually because each is in a namespace.

33 Identifying a Vocabulary via Data There is an alternate way of identifying an XML building block (data component): by embedding an identifier within the document, as data. The weekday list could be expressed like this: Sunday Monday Tuesday Wednesday Thursday Friday Saturday And the meetings list could be expressed like this: Dentist Doctor Boss

34 Cont. Things to note on the previous slide: 1. Namespaces are not being used. 2. The list is identified by the content of 3. The same XML vocabulary is used for both lists. (In fact, the same XML vocabulary is used for all lists) The two lists can be brought together into a single document and still be processed individually. Applications partition the document based on the value in
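The partitioning step described above can be sketched in Python as follows. The element names `lists`, `list`, `identifier`, and `value` are placeholders invented for this example; the transcript does not preserve the names the slides used.

```python
import xml.etree.ElementTree as ET

# Placeholder generic vocabulary: every list uses the same element names
# and carries its identity as data, not as a namespace.
LISTS = """
<lists>
  <list>
    <identifier>weekdays</identifier>
    <value>Sunday</value>
    <value>Monday</value>
    <value>Tuesday</value>
    <value>Wednesday</value>
    <value>Thursday</value>
    <value>Friday</value>
    <value>Saturday</value>
  </list>
  <list>
    <identifier>meetings</identifier>
    <value>Dentist</value>
    <value>Doctor</value>
    <value>Boss</value>
  </list>
</lists>
""".strip()


def partition_by_identifier(doc_xml):
    """Map each list's identifier text to the values that list contains."""
    root = ET.fromstring(doc_xml)
    partitions = {}
    for lst in root.iter("list"):
        ident = lst.findtext("identifier", default="").strip()
        partitions[ident] = [v.text.strip() for v in lst.findall("value") if v.text]
    return partitions
```

The same function handles any number of lists because it never names a specific one; the caller selects a list by looking up its identifier in the returned dictionary.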

35 Analysis The namespace approach has the benefit of being widely adopted. Most XML tools, parsers, and technologies are based on namespaces. For example, NVDL is entirely based on using namespaces to partition a compound document; an XSLT processor processes a document based on the XSLT namespace.

36 Cont. By using data to identify a list (rather than namespaces), the same XML vocabulary can be used for all lists. This makes list-processing algorithms and code independent of the content, so a single investment in software gives access to all code lists. However, that raises an interesting question: is content-specific processing easier when lists are expressed using a domain-specific vocabulary or when lists are expressed using a generic vocabulary?

37 Analysis of all Approaches Regardless of which approach is used, the meaning of the list and its values must be clearly documented. It may be challenging to achieve consensus on meaning: –The same terminology may be used by different people to mean different things. For example, one person expects to see Puerto Rico in a country list, whereas another person does not. This is because one person defines "country" only as principal sovereignties whereas another person defines "country" to include territories and protectorates. –Further, some people use different terminology to mean the same thing. For example, one person calls it "country" while another calls it "principality." With all approaches the issue arises of which terminology and definitions to adopt.

38 Recommendation Each of the four approaches has pros and cons so, as always, be sure to understand the alternatives and decide which is best for your situation.

39 genericode genericode is a standardized generic list vocabulary [1]. That is, it is an example of approach #4. Here's the idea behind the design of genericode's vocabulary: –Oftentimes when creating a list there are multiple ways to express each value in the list. For example, in a list of countries we may express the first value as Afghanistan or AF. genericode permits each value to be expressed in multiple ways. Thus, the list is expressed in terms of rows and columns - each row has a column for the multiple ways to express a list value. [1]

40 uniquely identify this list here AF AFGHANISTAN AL ALBANIA...
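The row-and-column idea can be illustrated with a small Python sketch. The element names below (`code-list`, `identifier`, `row`, `code`, `name`) are simplified placeholders and are not genericode's actual element names; only the AF/AFGHANISTAN and AL/ALBANIA values come from the slide.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical markup: each row holds one list value,
# and each child of a row is one column (one way to express that value).
CODE_LIST = """
<code-list>
  <identifier>countries</identifier>
  <row><code>AF</code><name>AFGHANISTAN</name></row>
  <row><code>AL</code><name>ALBANIA</name></row>
</code-list>
""".strip()


def load_rows(xml_text):
    """Turn each <row> into a dict keyed by column name."""
    root = ET.fromstring(xml_text)
    return [
        {col.tag: (col.text or "").strip() for col in row}
        for row in root.findall("row")
    ]


def lookup(rows, column, value):
    """Return the first row whose given column equals value, or None."""
    for row in rows:
        if row.get(column) == value:
            return row
    return None
```

Because every row carries all the expressions of a value, an application can accept either form as input and translate to the other with a single lookup.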

41 Acknowledgements Thanks to the following people for contributing to this document: –Roger Costello –Bruce Cox –Ken Holman –Rick Jelliffe –Michael Kay –Rob Simmons