XML, CM, and KM KMWorld 2001 Thursday November 1, 2001 Darlene Fichter Data Library Coordinator University of Saskatchewan Libraries Frank Cervone Assistant University Librarian for Information Technology Northwestern University
Agenda What is XML? What does it offer? What are some of the weaknesses? Trends in XML, CM, and KM
Why XML? A critical component of KM involves knowledge representation and codification To support knowledge activities, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning
What is XML? Structured data interchange – A common syntax for expressing structure in data Designed to account for “unstructured” data – Documents Inherently conveys meaning/structure Content and process separate from structure Delivered via standard text files
XML Example – Rich Site Summary book news Book news - headlines from around the web, refreshed every 15 minutes en-us
Headlines 'Author Unknown' by Don Foster Salon Nov :51AM
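The tags in the Rich Site Summary example above did not survive in this text, only the values. As a rough sketch, assuming RSS 0.91-style element names (`channel`, `title`, `description`, `language`, `item`) that are not shown on the slide itself, such a feed could be read with Python's standard library:

```python
import xml.etree.ElementTree as ET

# Assumed markup: the element names follow RSS 0.91 conventions and are NOT
# taken from the slide, which only preserves the text values.
RSS_SAMPLE = """<rss version="0.91">
  <channel>
    <title>book news</title>
    <description>Book news - headlines from around the web, refreshed every 15 minutes</description>
    <language>en-us</language>
    <item>
      <title>'Author Unknown' by Don Foster</title>
    </item>
  </channel>
</rss>"""


def read_channel(xml_text):
    """Return the channel metadata and the list of headline titles."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    info = {
        "title": channel.findtext("title"),
        "description": channel.findtext("description"),
        "language": channel.findtext("language"),
    }
    headlines = [item.findtext("title") for item in channel.findall("item")]
    return info, headlines


info, headlines = read_channel(RSS_SAMPLE)
print(info["title"], info["language"])
print(headlines)
```

The point of the sketch is that the structure travels with the data: a program can ask for "the language" or "every headline" by name rather than by position in the text.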
XML is open Open standards NOT proprietary Platform neutral, license-free and widely supported Influenced by a number of standards organizations Agreement on a number of core standards in the XML family
XML strengths Flexible – Makes collaborative information exchange simpler Less expensive implementation – Light-weight software modules Separates content from processing Easily internationalized – Full Unicode support Enables complex information retrieval
XML is flexible Very flexible – you can define your own languages, vocabulary, and metadata Easily extended by adding additional elements (fields) and attributes Data description can be sent with the data
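One way to picture "easily extended" is the sketch below. It uses an invented vocabulary (`memo`, `author`, `body`, `keywords` and the `lang` attribute are all hypothetical) and shows that a reader written for the original elements keeps working after a new element and attribute are added:

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary, invented for illustration only.
ORIGINAL = "<memo><author>J. Smith</author><body>Budget approved.</body></memo>"


def read_memo(xml_text):
    """A reader that only knows about the original two elements."""
    root = ET.fromstring(xml_text)
    return root.findtext("author"), root.findtext("body")


def extend_memo(xml_text):
    """Add a new attribute and a new element without touching existing ones."""
    root = ET.fromstring(xml_text)
    root.set("lang", "en")
    keywords = ET.SubElement(root, "keywords")
    keywords.text = "budget, finance"
    return ET.tostring(root, encoding="unicode")


extended = extend_memo(ORIGINAL)
print(extended)
# The old reader ignores what it does not know and still returns its fields.
print(read_memo(extended))
```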
XML enables less expensive implementation Implementation tools are modularized – XML browser can be implemented in less than 200K – HTML browser > 4MB to 80 MB Standard syntax makes processing easier and therefore less expensive – Simple implementation of “validity checking” Lower cost – Allow small and medium-sized organizations to participate in data exchange initiatives
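To give a feel for how little code a basic check needs, here is a minimal sketch. Note the assumption: it tests well-formedness only (properly nested and closed tags). Validity in the strict XML sense also requires checking against a DTD or schema, which Python's standard-library parser does not do.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise.

    This checks well-formedness only, not validity against a DTD or schema.
    """
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<a><b>ok</b></a>"))
print(is_well_formed("<a><b>broken</a>"))
```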
XML separates content from process Doesn’t impose a particular manner for processing Doesn’t impose constraints on how to handle information Same data can be used in web page, hand held device through simple “transformations” – “loosely coupled” – “future proof”
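The idea of one source feeding several outputs can be sketched as follows. This uses plain Python functions rather than a stylesheet language such as XSLT, and the `news`/`item`/`source` vocabulary and the sample values are hypothetical:

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample data, invented for illustration.
NEWS = """<news>
  <item><title>XML in libraries</title><source>Example Times</source></item>
  <item><title>Topic maps explained</title><source>Example Weekly</source></item>
</news>"""


def to_html(xml_text):
    """Render the items as an HTML list for a web page."""
    root = ET.fromstring(xml_text)
    rows = [
        "<li>{} <em>({})</em></li>".format(
            escape(item.findtext("title")), escape(item.findtext("source"))
        )
        for item in root.findall("item")
    ]
    return "<ul>" + "".join(rows) + "</ul>"


def to_text(xml_text):
    """Render titles only, one per line, for a small hand-held display."""
    root = ET.fromstring(xml_text)
    return "\n".join(item.findtext("title") for item in root.findall("item"))


print(to_html(NEWS))
print(to_text(NEWS))
```

Neither function changes the source document, which is what keeps the data "loosely coupled" to any one presentation.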
XML is easily internationalized Unicode standard supports a wide range of languages and scripts – Latin (Western and Eastern European, non-Western languages) – Greek – Cyrillic – Hebrew – Arabic – Armenian – Georgian – Thai – Lao – Hangul (Korean) – Ideographs (Chinese, Japanese, Korean) – Hiragana and Katakana (Japanese) – Cherokee – Khmer – Ethiopic
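A small sketch of what Unicode support means in practice: text in several scripts is written to one UTF-8 XML document and read back unchanged. The `greetings`/`greeting` elements and the plain `lang` attribute are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# Sample strings in Greek, Cyrillic, Hiragana and Hangul.
GREETINGS = {
    "el": "Καλημέρα",
    "ru": "Доброе утро",
    "ja": "おはよう",
    "ko": "안녕하세요",
}


def build(greetings):
    """Serialize the greetings to UTF-8 encoded XML bytes."""
    root = ET.Element("greetings")
    for lang, text in greetings.items():
        element = ET.SubElement(root, "greeting", {"lang": lang})
        element.text = text
    return ET.tostring(root, encoding="utf-8")


def parse(data):
    """Read the greetings back into a dictionary keyed by language code."""
    root = ET.fromstring(data)
    return {g.get("lang"): g.text for g in root.findall("greeting")}


encoded = build(GREETINGS)
print(parse(encoded) == GREETINGS)
```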
XML enables complex information retrieval Supports encoding of metadata through both standardized and constructed tag sets
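As an illustration of retrieval driven by metadata tags, the sketch below searches a small catalog by a `subject` element instead of by free text. The catalog, its element names and its records are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog with a constructed tag set.
CATALOG = """<catalog>
  <record id="r1"><title>Intro to XML</title><subject>markup</subject><year>2000</year></record>
  <record id="r2"><title>Knowledge Maps</title><subject>knowledge management</subject><year>2001</year></record>
  <record id="r3"><title>XML for Portals</title><subject>markup</subject><year>2001</year></record>
</catalog>"""


def titles_by_subject(xml_text, subject):
    """Return titles of records whose <subject> equals the given value.

    ElementTree supports a limited XPath subset, including the
    [child='text'] predicate used here. The subject value must not
    contain a single quote in this simple sketch.
    """
    root = ET.fromstring(xml_text)
    matches = root.findall(f"record[subject='{subject}']")
    return [record.findtext("title") for record in matches]


print(titles_by_subject(CATALOG, "markup"))
```

A plain keyword search for "markup" could not tell a record about markup from one that merely mentions the word. The tag carries that distinction.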
XML downsides Space, processor, and bandwidth hog Just a document syntax, not a full-fledged programming language Doesn’t work for binary data Is a regression from centralized and efficient databanks Specifications are not complete
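The binary-data and space concerns can be made concrete with a sketch. A common workaround, assumed here and not described on the slide, is to text-encode binary content with Base64 before placing it in an element, which inflates it by about a third. The `attachment` element name is hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(payload):
    """Embed raw bytes in an XML element as Base64 text."""
    element = ET.Element("attachment", {"encoding": "base64"})
    element.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def unwrap_binary(xml_text):
    """Recover the original bytes from the XML element."""
    element = ET.fromstring(xml_text)
    return base64.b64decode(element.text)


payload = bytes(range(256)) * 12  # 3072 arbitrary bytes
doc = wrap_binary(payload)
print(len(payload), len(doc))
```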
XML – just one part of the puzzle
XML and content management CM systems repositories use XML for tagging and storing information CM systems use XML as a standard protocol for integration with other applications XML is invisible to the information creator – XML markup created as the information is captured
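"Invisible to the information creator" might look like the sketch below: the author fills in ordinary form fields and the system produces the markup. The field names and the `document` wrapper element are hypothetical, not taken from any particular CM product.

```python
import xml.etree.ElementTree as ET


def capture_to_xml(fields):
    """Turn captured form fields into an XML document.

    Field names are used as element names, so they must be valid XML names.
    Special characters in the values are escaped by the serializer.
    """
    doc = ET.Element("document")
    for name, value in fields.items():
        ET.SubElement(doc, name).text = value
    return ET.tostring(doc, encoding="unicode")


# What the author typed into a form; no markup in sight.
form = {
    "title": "Q3 Report",
    "author": "A. Writer",
    "body": "Sales rose & costs fell.",
}
xml_text = capture_to_xml(form)
print(xml_text)
```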
Emerging Standards For KM XTM OPML RFML FLBC Industry-specific standards: – Legal – Publishing – Scientific research
XTM: Topic Maps Topic maps are a new ISO standard for describing knowledge structures and associating them with information resources Used to organize information into knowledge bases “GPS” for information “A book without an index is like a country without a map”
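The core ideas can be sketched in a few lines: topics, associations between topics, and occurrences that point at information resources. This is an in-memory model for illustration only, not XTM syntax, and the class names, relation label and URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Occurrences: locators of resources that are about this topic.
    occurrences: list = field(default_factory=list)


@dataclass
class TopicMap:
    topics: dict = field(default_factory=dict)
    # Each association is a (topic_id, relation, topic_id) triple.
    associations: list = field(default_factory=list)

    def add_topic(self, topic_id, name, *occurrences):
        self.topics[topic_id] = Topic(name, list(occurrences))

    def associate(self, a, relation, b):
        self.associations.append((a, relation, b))

    def related(self, topic_id):
        """Return (relation, topic name) pairs linked to the given topic."""
        found = []
        for a, relation, b in self.associations:
            if a == topic_id:
                found.append((relation, self.topics[b].name))
            elif b == topic_id:
                found.append((relation, self.topics[a].name))
        return found


tm = TopicMap()
tm.add_topic("xml", "XML", "http://example.org/xml-intro.html")
tm.add_topic("km", "Knowledge management")
tm.associate("xml", "supports", "km")
print(tm.related("km"))
print(tm.topics["xml"].occurrences)
```

The map sits above the resources like an index above a book: the documents themselves are untouched, and navigation happens through the topics.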
OPML Outline Processor Markup Language – Outline-structured information Used for data that is easily browsed and edited – Specifications – Legal briefs – Product plans – Presentations – Screenplays – Directories
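A short sketch of reading outline-structured data: OPML nests `outline` elements inside a `body`, with each node's label in a `text` attribute. The sample product plan below is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical outline content in OPML 1.0 form.
OPML_SAMPLE = """<opml version="1.0">
  <head><title>Product plan</title></head>
  <body>
    <outline text="Goals">
      <outline text="Ship beta"/>
      <outline text="Gather feedback"/>
    </outline>
    <outline text="Schedule"/>
  </body>
</opml>"""


def flatten(xml_text):
    """Return the outline as a list of lines indented by nesting depth."""
    root = ET.fromstring(xml_text)
    lines = []

    def walk(node, depth):
        # Visit direct <outline> children, then recurse into each one.
        for child in node.findall("outline"):
            lines.append("  " * depth + child.get("text", ""))
            walk(child, depth + 1)

    walk(root.find("body"), 0)
    return lines


print("\n".join(flatten(OPML_SAMPLE)))
```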
RFML Relational-functional markup language Used to define relationships and functions among data elements – Tables within relational databases – Relational views
FLBC Formal Language for Business Communication – Automated communication – Conversation management – Dialog management – Based on speech act theory Formally defined message types Broad range of message types Defined in terms of intentions Clear delineation between message type and content
XML in Use Portals Content management & syndication Content management: industry sector Integration Analytical/decision making Search and retrieval Visualization
Questions