Extensible Markup Comes of Age in XHTML
Don Kiely, Software Technologist
Third Sector Technologies, Fairbanks, Alaska
6-306
Me.About
- Software Technologist for Third Sector Technologies in Fairbanks, Alaska
- Develop software and Web applications
- Business and technology consulting
My Other Jobs
- Author
  - Several books, including VB Programmer’s Guide to the Windows Registry
  - Regular contributor to several publications, including Informant’s Microsoft Office & VBA Developer, VBPJ, and Information Week
- Training
  - VB, VBA, VI, and SQL Server instructor for Application Developers Training Company
The Death of HTML at Last
- Not!
- HTML has fueled one of the greatest transfers of technology in human history
- But it is inflexible, standards don’t keep up, and vendors go wild adding extensions
  - Ancient technology in Internet time
- Nothing will kill off HTML any time soon
  - And we’ll still be writing the ugly mix of HTML and scripting code too
- But the wild and wooly days are over
XML
- Touted as HTML killer
  - But HTML is a formatting markup language, while XML is a data markup language
- Believes its own press releases
- Key is the X: extensible
  - Make your own tags
- Much more rigid than HTML: follow the rules
  - Well-formed
  - Valid
Two Great Tastes That Taste Great Together
- XML + HTML = XHTML
  - Extensible Hypertext Markup Language
- ‘A Reformulation of HTML 4 in XML 1.0’
- HTML rewritten as XML document type definitions (DTDs)
  - Both have roots in SGML
- Original recommendation published by W3C on 26 January 2000
Introducing XHTML
- A ‘bridge to the future’
- Promises to make Web sites more adaptable while supporting existing sites
- Three major advantages
  - Extensibility: supports new and custom tags
  - Portability: interoperability, Web on any device
  - Modularity: support for subsets of features
- Not to be confused with the HTML editor named ‘XHTML’
XHTML DTDs
- Specifies three XML document types, which correspond to the three HTML 4.01 DTDs
- Strict: formatting is left to Cascading Style Sheets
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
- Transitional: allows presentational markup, so pages are not limited to browsers that support CSS
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">
- Frameset: documents with frames
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "DTD/xhtml1-frameset.dtd">
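As a small illustration of how the three declarations differ only in their identifiers, the sketch below builds each one from a lookup table. The helper name `doctype_for` and the table name are hypothetical; the identifiers themselves are the ones listed on the slide.

```python
# Public and system identifiers for the three XHTML 1.0 DTDs, as listed above.
XHTML_DTDS = {
    "strict": ("-//W3C//DTD XHTML 1.0 Strict//EN", "DTD/xhtml1-strict.dtd"),
    "transitional": ("-//W3C//DTD XHTML 1.0 Transitional//EN", "DTD/xhtml1-transitional.dtd"),
    "frameset": ("-//W3C//DTD XHTML 1.0 Frameset//EN", "DTD/xhtml1-frameset.dtd"),
}


def doctype_for(flavor):
    """Return the DOCTYPE declaration for 'strict', 'transitional', or 'frameset'."""
    public_id, system_id = XHTML_DTDS[flavor]
    return f'<!DOCTYPE html PUBLIC "{public_id}" "{system_id}">'


print(doctype_for("strict"))
```

An unknown flavor raises `KeyError`, which seems a reasonable failure mode for a sketch like this.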
Strict XHTML Document
- Restricted to tags and attributes from the XHTML 1.0 namespace
  - Must validate against one of the three DTDs
- Root element of the document must be <html>
  - Root element must designate the XHTML 1.0 namespace using the xmlns attribute
  - There must be a DOCTYPE declaration in the document prior to the root element
  - The public identifier in the DOCTYPE declaration must reference one of the three required DTDs
XHTML Document Structure
- Basic structure
  - DOCTYPE
    - Doesn’t affect how the page behaves as HTML: it matters only when the document is validated
  - <html>
    - Head
    - Body
  - </html>
Minimal XHTML
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
      <title>Virtual Library</title>
    </head>
    <body>
      <p>Moved to vlib.org.</p>
    </body>
  </html>
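The structural rules for a strictly conforming document can be spot-checked with a few lines of code. The sketch below uses Python's standard `xml.dom.minidom`; the function name `conformance_problems` is hypothetical. It checks only the root element, the namespace, and the DOCTYPE public identifier. It does not validate against the DTD, because the standard library has no validating parser.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"
XHTML_PUBLIC_IDS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
}


def conformance_problems(source):
    """List structural problems; raises ExpatError if source is not well-formed."""
    document = minidom.parseString(source)
    problems = []
    root = document.documentElement
    if root.localName != "html":
        problems.append("root element is not html")
    if root.namespaceURI != XHTML_NS:
        problems.append("root element does not designate the XHTML namespace")
    if document.doctype is None:
        problems.append("no DOCTYPE declaration before the root element")
    elif document.doctype.publicId not in XHTML_PUBLIC_IDS:
        problems.append("DOCTYPE does not reference one of the three XHTML 1.0 DTDs")
    return problems


MINIMAL = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head><title>Virtual Library</title></head>
  <body><p>Moved to vlib.org.</p></body>
</html>"""

print(conformance_problems(MINIMAL))
```

A page with no DOCTYPE and no xmlns attribute on the root element would come back with two entries in the list.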
XHTML and Other Namespaces
- No longer strictly conforming
  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
      <title>A Math Example</title>
    </head>
    <body>
      <p>The following is MathML markup:</p>
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <apply> <log/>
          <logbase>
            <cn> 3 </cn>
          </logbase>
          <ci> x </ci>
        </apply>
      </math>
      ...
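To see what mixing namespaces means to a parser, the sketch below lists the namespaces actually used by the elements of a document, in order of first appearance. It uses the standard `xml.etree.ElementTree`; the function name `namespaces_used` and the sample document are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

MIXED = """<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head><title>A Math Example</title></head>
  <body>
    <p>The following is MathML markup:</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <apply><log/><logbase><cn>3</cn></logbase><ci>x</ci></apply>
    </math>
  </body>
</html>"""


def namespaces_used(source):
    """Return element namespaces in order of first appearance ('' means none)."""
    found = []
    for element in ET.fromstring(source).iter():
        tag = element.tag
        # ElementTree writes namespaced tags as "{namespace-uri}localname".
        namespace = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
        if namespace not in found:
            found.append(namespace)
    return found


for namespace in namespaces_used(MIXED):
    print(namespace)
```

A document that reports more than the XHTML namespace can still be well-formed, but it is no longer a strictly conforming XHTML 1.0 document.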
Incorporating Into Other Namespaces
  <book xmlns='urn:loc.gov:books'
        xmlns:isbn='urn:ISBN: ' xml:lang="en" lang="en">
    <title>Cheaper by the Dozen</title>
    <notes>
      <p xmlns='http://www.w3.org/1999/xhtml'>
        This is also available online.
      </p>
    </notes>
  </book>
User Agent Conformance
- Must parse for well-formedness
- When parsing as XML, only recognize attributes of type ID as fragment identifiers
- Rules for graceful degradation when the agent doesn’t recognize an element
- Whitespace rules
  - It’ll take time for agents to enforce these rules
Not Your Grandma’s HTML
- Some substantial differences from HTML
  - Some from sloppiness allowed by HTML
  - Others from the XML way of doing things
  - Validation and conformance issues
  - Parsers will be much trimmer and easier to write
  - Either XHTML code will work, or it won’t
- Document structure
  - Root element must be <html> and must designate the XHTML 1.0 namespace
  - <head> and <body> elements cannot be omitted
  - <title> element must be the first element in the <head> element
More XHTML Differences
- Well-formed, strictly complying with syntax rules
  - Tags must be nested properly
  - Every element must have a closing tag or be written in the special form that combines the opening and closing tag
  - Empty elements must either have an end tag, or the start tag must end with />
  - An element written this way is sometimes called a self-terminating element
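The empty-element rule is easy to try out with any XML parser. The sketch below uses the standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the `<br>` samples are illustrative choices, not taken from the slide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if an XML parser accepts the fragment, False otherwise."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# HTML-style empty element with no end tag: rejected by an XML parser.
print(is_well_formed("<p>Line one<br>Line two</p>"))
# Start tag ending in />: accepted.
print(is_well_formed("<p>Line one<br />Line two</p>"))
# Explicit end tag: also accepted.
print(is_well_formed("<p>Line one<br></br>Line two</p>"))
```

Both accepted forms are equivalent to the parser. The space before `/>` is a common convention for compatibility with older HTML browsers.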
XHTML Case/Attributes
- XML is case-sensitive, and the XHTML DTDs are written in lower case
  - Element and attribute names must be lower case
  - User-defined attribute values can be in any case
- All attribute values, including those that appear to be numeric, must be quoted in single or double quotes
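Case and quoting mistakes both surface as parse errors. The sketch below asks the standard `xml.parsers.expat` module for its error message on a few hypothetical fragments; the helper name `xml_error` is an assumption. Note that a generic XML parser only notices inconsistent case. Enforcing lower-case names requires validation against the XHTML DTD.

```python
from xml.parsers import expat


def xml_error(fragment):
    """Return expat's error message for the fragment, or None if it parses."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(fragment, True)
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return None


# Start and end tags that differ only in case do not match in XML.
print(xml_error("<P>text</p>"))
# A numeric-looking attribute value still needs quotes.
print(xml_error("<td width=100>cell</td>"))
# Single or double quotes are both fine.
print(xml_error("<td width='100'>cell</td>"))
print(xml_error('<td width="100">cell</td>'))
```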
Nested XHTML Tags
- Elements must be properly nested: closing tags must appear in the reverse order of the opening tags
- Unacceptable because the closing tags are out of order:
  <p><i>Italicized paragraph</p></i>
- Properly nested tags:
  <p><i>Italicized paragraph</i></p>
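The reverse-order rule is a stack discipline: each closing tag must match the most recently opened element. The sketch below makes that explicit with a hand-rolled checker. The function name and the simplified tag pattern are assumptions, and the pattern ignores edge cases such as a `>` inside an attribute value, so treat it as an illustration rather than a parser.

```python
import re

# Matches start tags, end tags, and self-closing tags with simple names.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")


def first_nesting_error(markup):
    """Return a message for the first misnested tag, or None if nesting is correct."""
    open_tags = []  # stack of element names still waiting for an end tag
    for match in TAG.finditer(markup):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # <br /> opens and closes in one step
        if not closing:
            open_tags.append(name)
        elif not open_tags:
            return f"</{name}> has no matching start tag"
        elif open_tags[-1] != name:
            return f"expected </{open_tags[-1]}> but found </{name}>"
        else:
            open_tags.pop()
    if open_tags:
        return f"<{open_tags[-1]}> is never closed"
    return None


print(first_nesting_error("<p><i>Italicized paragraph</p></i>"))
print(first_nesting_error("<p><i>Italicized paragraph</i></p>"))
```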
Minimized Attributes
- XHTML does not allow minimized attributes
  - An attribute is minimized when it has only one possible value and is written as a bare name with no value
- Unacceptable in XHTML: the bare attribute name
- Without attribute minimization: the attribute written out in full, with its quoted value
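Python's standard `html.parser` reports a minimized attribute with a value of `None`, which makes it straightforward to expand. In the sketch below, the function name `expand_minimized` is hypothetical, and the `<dl compact>` sample is an illustrative choice modeled on the example in the XHTML 1.0 specification. The function handles a single start tag only.

```python
import html
from html.parser import HTMLParser


class _StartTagCollector(HTMLParser):
    """Records the name and attributes of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, attrs))


def expand_minimized(start_tag):
    """Rewrite one HTML start tag so that every attribute has a quoted value."""
    collector = _StartTagCollector()
    collector.feed(start_tag)
    collector.close()
    tag, attrs = collector.start_tags[0]
    parts = [tag]
    for name, value in attrs:
        if value is None:  # minimized: the parser saw a bare name
            value = name
        parts.append(f'{name}="{html.escape(value, quote=True)}"')
    return "<" + " ".join(parts) + ">"


print(expand_minimized("<dl compact>"))
print(expand_minimized('<option value="1" selected>'))
```

The convention of repeating the attribute name as its value (`compact="compact"`) is the one the XHTML 1.0 specification uses.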
Comments
- An XML parser is not required to preserve comments, so script can’t be hidden inside them
  - < and & are treated as the start of markup
- Use a CDATA section: <![CDATA[ ... ]]>
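The XHTML 1.0 specification suggests wrapping script content in a CDATA section so that `<` and `&` are not read as markup. The sketch below shows the difference with the standard `xml.etree.ElementTree`; the script body and the helper name `script_text` are made up for illustration.

```python
import xml.etree.ElementTree as ET

SCRIPT_WITH_CDATA = """<script type="text/javascript">
<![CDATA[
if (a < b && b < c) { window.alert("in order"); }
]]>
</script>"""

SCRIPT_WITHOUT_CDATA = """<script type="text/javascript">
if (a < b && b < c) { window.alert("in order"); }
</script>"""


def script_text(source):
    """Return the script's text content, or None if the markup is not well-formed."""
    try:
        return ET.fromstring(source).text
    except ET.ParseError:
        return None


# Inside CDATA, the comparison operators reach the script untouched.
print(script_text(SCRIPT_WITH_CDATA))
# Without CDATA, the bare < and & make the document ill-formed.
print(script_text(SCRIPT_WITHOUT_CDATA))
```

Moving the script to an external file sidesteps the problem entirely, which is the other option the specification mentions.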
id and name Attributes
- Used as fragment identifiers
- XML fragment identifiers are of type ID
  - So in XHTML, the id attribute is of type ID
  - name is formally deprecated, so don’t count on it appearing in future versions of XHTML
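A rough sketch of fragment resolution when a document is treated as XML: only `id` is consulted, never `name`. The sample page and the function name `resolve_fragment` are hypothetical. The standard `xml.etree.ElementTree` does not read the DTD, so the code assumes by convention that the attribute called `id` is the one of type ID.

```python
import xml.etree.ElementTree as ET

PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Anchors</title></head>
  <body>
    <h2 id="intro">Introduction</h2>
    <p><a name="legacy">Old-style anchor with name only</a></p>
  </body>
</html>"""


def resolve_fragment(source, fragment):
    """Return the element whose id matches the fragment identifier, or None."""
    target = fragment.lstrip("#")
    for element in ET.fromstring(source).iter():
        if element.get("id") == target:
            return element
    return None


print(resolve_fragment(PAGE, "#intro").text)
# An anchor that carries only a name attribute is not found.
print(resolve_fragment(PAGE, "#legacy"))
```

For pages that must also work in older HTML browsers, giving an anchor both `id` and `name` with the same value keeps both kinds of user agent happy.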
Element Prohibitions
- a cannot contain other a elements.
- pre cannot contain the img, object, big, small, sub, or sup elements.
- button cannot contain the input, select, textarea, label, button, form, fieldset, iframe or isindex elements.
- label cannot contain other label elements.
- form cannot contain other form elements.
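These prohibitions cannot be fully expressed in a DTD, so a checker has to walk the tree. The sketch below encodes the list above as a table and reports every violation, at any depth of nesting. The names `PROHIBITED` and `prohibition_violations` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Ancestor element -> descendants it must not contain, from the list above.
PROHIBITED = {
    "a": {"a"},
    "pre": {"img", "object", "big", "small", "sub", "sup"},
    "button": {"input", "select", "textarea", "label", "button",
               "form", "fieldset", "iframe", "isindex"},
    "label": {"label"},
    "form": {"form"},
}


def local_name(element):
    """Strip the "{namespace}" prefix that ElementTree puts on tag names."""
    return element.tag.rsplit("}", 1)[-1]


def prohibition_violations(source):
    """Return (ancestor, descendant) pairs that break the prohibitions."""
    violations = []

    def walk(element, ancestors):
        name = local_name(element)
        for ancestor in ancestors:
            if name in PROHIBITED.get(ancestor, ()):
                violations.append((ancestor, name))
        for child in element:
            walk(child, ancestors + [name])

    walk(ET.fromstring(source), [])
    return violations


print(prohibition_violations('<p><a href="x">outer <em><a href="y">inner</a></em></a></p>'))
print(prohibition_violations('<pre>diagram: <img src="d.png" alt="" /></pre>'))
```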
Moving to XHTML
- Migrate existing pages or start over?
  - Users can view carefully crafted XHTML in the latest versions of today’s browsers
  - XHTML media types supported by most browsers: text/html, text/xml, application/xml
  - Scripting code that uses the HTML or XML document object models will work just fine
- Conversion tools
  - Poor HTML translates poorly
Modifying HTML Pages
- Biggest time sinks
  - Converting tags/attributes to lower case
  - Quoting attributes
- The cleaner the HTML, and the closer it is to the HTML 4 standard, the better
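The two time sinks named above are mechanical enough to automate. The sketch below, built on Python's standard `html.parser`, lower-cases names, quotes attribute values, expands minimized attributes, and closes empty elements. The class and function names are hypothetical. It deliberately does not repair missing end tags or bad nesting, and it drops comments and the DOCTYPE, so a dedicated tool such as HTML Tidy is the better choice for real pages.

```python
import html
from html.parser import HTMLParser

# Elements that HTML 4 defines as empty; XHTML needs them closed with />.
EMPTY_ELEMENTS = {"area", "base", "basefont", "br", "col", "frame", "hr",
                  "img", "input", "isindex", "link", "meta", "param"}


class XhtmlRewriter(HTMLParser):
    """Re-emits markup with lower-case names and quoted attribute values."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.output = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already lower-cased the tag and attribute names.
        rendered = [tag]
        for name, value in attrs:
            if value is None:  # minimized attribute
                value = name
            rendered.append(f'{name}="{html.escape(value, quote=True)}"')
        closer = " />" if tag in EMPTY_ELEMENTS else ">"
        self.output.append("<" + " ".join(rendered) + closer)

    def handle_endtag(self, tag):
        if tag not in EMPTY_ELEMENTS:  # empty elements were closed already
            self.output.append(f"</{tag}>")

    def handle_data(self, data):
        self.output.append(data)

    def handle_entityref(self, name):
        self.output.append(f"&{name};")

    def handle_charref(self, name):
        self.output.append(f"&#{name};")


def to_xhtml_markup(html_source):
    """Return html_source with XHTML-style tags; structure is left unchanged."""
    rewriter = XhtmlRewriter()
    rewriter.feed(html_source)
    rewriter.close()
    return "".join(rewriter.output)


print(to_xhtml_markup('<P ALIGN=center>Hello<BR>world <IMG SRC="a.gif" ALT=logo></P>'))
```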
Compatibility Guidelines
- Feature of HTML 4 specification
  - Same codebase can be used with XHTML-compliant browsers as well as those supporting straight HTML
  - But avoid using new tag definitions
Migrating Existing Content
- XHTML 1.0 produces HTML 4 code, but some user agents won’t render it properly
- See Appendix C of the XHTML 1.0 spec
XHTML 1.1
- Now under development
- Biggest change will be further modularization
  - Well-defined sub- and supersets of XHTML for various devices
  - This is how XHTML will allow going beyond HTML
- Combine with Composite Capability/Preference Profiles (CC/PP) to bring mobile devices fully to the Web
Resources 1
- Standards in various stages
  - XHTML 1.0
  - XHTML 1.1
  - HTML 4.0
  - XHTML Basic, subset of XHTML for handheld devices
  - XHTML Modularization
  - XHTML Events Module
  - XHTML Document Profile Requirements
  - Building XHTML Modules
Resources 2
- XHTML tools and support
  - XHTML and HTML validator
    - Place a link to it on your Web page
  - Clean up Web pages with HTML Tidy
  - HTML Kit Web editor, with support for HTML Tidy
  - XHTML.ORG, a Web site with news and information (but it is getting dated)
  - Public list about HTML, hosted and archived by W3C
Questions?
- Thanks for attending!
- Please remember to fill out the evaluation forms!
- How can I make this a better presentation?
Don Kiely
Third Sector Technologies