Organizing Information Digitally Norm Friesen
Overview General properties of digital information Relational: tabular & linked Object-Oriented: inheritance & modularity Markup: serial & hierarchical
General Properties Multiple Axes and access points –Allow for different views Form & Content can (should) be separate Formatting can be used for analysis & organization of data Instructions and data can be combined; –effects of instructions are difficult to control Database software for each type
Examples Relational: library catalogue, Amazon.com, hotel reservation system Markup: Web pages & Google, Blogs & RSS Object-Oriented: programs of all kinds; Windows XP, Office, etc.; the Java programming language
Relational Tables and links Table: “a systematic arrangement of data usually in rows and columns for ready reference” Entity: represents a category or example, rather than a specific instance of that category. Entities can be thought of (roughly) as nouns.
Deriving tables from text Tabs, commas and hard returns (paragraphs) are often used to indicate rows and columns in a table Data in this format often called “flat files.” Can be used as a way of getting data “into” a database: make a list into a database table
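The step from a flat file to a database table can be sketched as follows. This is a minimal illustration, assuming Python's standard csv and sqlite3 modules; the sample data, the table name person and the function load_flat_file are hypothetical and do not come from the slides.

```python
import csv
import io
import sqlite3

# Hypothetical flat file: hard returns separate rows, commas separate columns.
FLAT_FILE = """name,city
Ada,London
Grace,New York
"""


def load_flat_file(text):
    """Read comma-separated text and load it into an in-memory table."""
    # The first line supplies the column names for every later row.
    rows = list(csv.DictReader(io.StringIO(text)))
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO person (name, city) VALUES (:name, :city)", rows
    )
    return conn


conn = load_flat_file(FLAT_FILE)
people = conn.execute("SELECT name, city FROM person ORDER BY name").fetchall()
print(people)
```

A tab-separated file could be handled the same way by passing delimiter="\t" to csv.DictReader.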
Relational, cont’d An entity described in a table can be related to other entities –E.g. person and membership card(s) This relationship can be: –One to one –One to many –Many to many
Primary Key Primary Key: a field that uniquely identifies each record stored in a table. This field is often automatically numbered; it cannot contain any empty, blank or null values.
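As a rough sketch of how a primary key behaves, the following uses SQLite through Python's sqlite3 module. The table users and its columns are hypothetical. In SQLite specifically, an INTEGER PRIMARY KEY column is numbered automatically when no value is supplied; other database systems use different syntax for the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# No user_id is given, so SQLite assigns the next number itself.
conn.execute("INSERT INTO users (name) VALUES ('Ada')")
conn.execute("INSERT INTO users (name) VALUES ('Grace')")
ids = [row[0] for row in conn.execute("SELECT user_id FROM users ORDER BY user_id")]


def insert_duplicate_key(connection):
    """Try to reuse key 1; return True only if the database accepts it."""
    try:
        connection.execute(
            "INSERT INTO users (user_id, name) VALUES (1, 'Duplicate')"
        )
    except sqlite3.IntegrityError:
        # The primary key must identify exactly one record.
        return False
    return True


duplicate_accepted = insert_duplicate_key(conn)
print(ids, duplicate_accepted)
```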
Definition: Relation Relation: A connection between two tables, each describing an entity that interacts with the other. In the example above, users (described in the first table) compose and send messages (described in the second table). The values of the primary key for one of these entities are stored in two places: in its own table, and as a foreign key in the related table.
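The users-and-messages relation might be sketched like this, again assuming SQLite via Python's sqlite3 module. The table names, column names and sample rows are illustrative assumptions rather than the contents of the original slide. It is a one-to-many relation: one user can send many messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys on request
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE messages (
    message_id INTEGER PRIMARY KEY,
    sender_id  INTEGER NOT NULL REFERENCES users(user_id),  -- foreign key
    body       TEXT NOT NULL
);
INSERT INTO users (user_id, name) VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO messages (sender_id, body)
    VALUES (1, 'Hello'), (1, 'Meeting at noon'), (2, 'Thanks');
""")

# The join follows the foreign key back to the table that owns the primary key.
sent = conn.execute("""
    SELECT users.name, messages.body
    FROM messages
    JOIN users ON messages.sender_id = users.user_id
    ORDER BY messages.message_id
""").fetchall()
print(sent)
```

The value 1 appears both as Ada's primary key in users and as sender_id in two rows of messages, which is the "stored in two places" idea from the definition.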
Many to Many: Junction Table
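A junction table can be sketched as a third table that holds only pairs of foreign keys. The example below assumes SQLite via Python's sqlite3 module; the person, club and membership tables and the two helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE club   (club_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Junction table: each row links one person to one club.
CREATE TABLE membership (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    club_id   INTEGER NOT NULL REFERENCES club(club_id),
    PRIMARY KEY (person_id, club_id)
);
INSERT INTO person VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO club VALUES (1, 'Chess'), (2, 'Hiking');
INSERT INTO membership VALUES (1, 1), (1, 2), (2, 2);
""")


def clubs_of(name):
    """Clubs that one person belongs to (one person, many clubs)."""
    rows = conn.execute("""
        SELECT club.title FROM membership
        JOIN person ON membership.person_id = person.person_id
        JOIN club ON membership.club_id = club.club_id
        WHERE person.name = ? ORDER BY club.title
    """, (name,))
    return [row[0] for row in rows]


def members_of(title):
    """People who belong to one club (one club, many people)."""
    rows = conn.execute("""
        SELECT person.name FROM membership
        JOIN person ON membership.person_id = person.person_id
        JOIN club ON membership.club_id = club.club_id
        WHERE club.title = ? ORDER BY person.name
    """, (title,))
    return [row[0] for row in rows]


print(clubs_of("Ada"), members_of("Hiking"))
```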
Activity: a 2-Table database Think of examples Look at examples for the database application project Include primary and foreign key Make sure that you use the correct relation type
Relational Data: Other Characteristics Particular means of querying: SQL or Structured Query Language –ISO/IEC 9075; Information Technology - Database Languages Not good at representing complex relationships and some kinds of entities/data –Complexity can sometimes be accommodated at the price of performance –Multimedia not easy to accommodate
Object-Orientation Way of organizing and conceptualizing information largely for the purposes of programming Programming: the creation of a step-by-step list of instructions written for a particular computer environment in a particular language.
Object Orientation: Characteristics Modular: Black boxes with a standardized interface; encapsulation Classes and inheritance: part of producing and modifying program components Operation: what the object can do
Object Orientation: Modular Bugs tend to arise from unexpected consequences of relations between parts of a program –Simplify relations by defining modular program components that relate to one another through clearly defined interfaces. –Programmers and program components only deal with the interface, not the module or object contents.
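The "black box with an interface" idea could look like this in Python. The class MessageQueue is a hypothetical example; the leading underscore on _items is a Python convention marking it as internal, not an enforced restriction.

```python
class MessageQueue:
    """A small module: callers use only put(), get() and len()."""

    def __init__(self):
        # Internal storage. Other code is not meant to touch this directly.
        self._items = []

    def put(self, item):
        """Add an item to the back of the queue."""
        self._items.append(item)

    def get(self):
        """Remove and return the item at the front of the queue."""
        return self._items.pop(0)

    def __len__(self):
        return len(self._items)


queue = MessageQueue()
queue.put("first")
queue.put("second")
first = queue.get()
print(first, len(queue))
```

Because callers depend only on put, get and len, the internal list could later be replaced by another structure without changing any code that uses the queue.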
Object Orientation: Classes A class is a pattern, template, or blueprint for a category of structurally identical items. The items created using the class are called instances. This is often referred to as the “class as a ‘cookie cutter’” view. As you might guess, the instances are the “cookies.”
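In Python, the cookie-cutter view might be sketched as follows. The Cookie class and its flavour field are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cookie:
    """The class is the cutter: it fixes the structure every cookie shares."""
    flavour: str


# Each call to the class stamps out a separate instance.
ginger = Cookie("ginger")
oat = Cookie("oat")

print(type(ginger) is type(oat))  # same class
print(ginger is oat)              # different instances
```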
Object Orientation: Inheritance “In an object-oriented context, we speak of specializations as "inheriting" characteristics from their corresponding generalizations. Inheritance can be defined as the process whereby one object acquires (gets, receives) characteristics from one or more other objects.”
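A minimal sketch of a specialization inheriting from a generalization, assuming Python classes. Document and WebPage are hypothetical names chosen for this example.

```python
class Document:
    """Generalization: what every document has in common."""

    def __init__(self, title):
        self.title = title

    def describe(self):
        return f"Document: {self.title}"

    def upper_title(self):
        return self.title.upper()


class WebPage(Document):
    """Specialization: inherits from Document and adds a URL."""

    def __init__(self, title, url):
        super().__init__(title)  # reuse the inherited setup
        self.url = url

    def describe(self):
        # Extend the inherited behaviour instead of rewriting it.
        return f"{super().describe()} ({self.url})"


page = WebPage("Home", "http://example.org/")
print(page.describe())
print(page.upper_title())  # inherited unchanged from Document
```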
Object-oriented Databases Data is stored as objects and can be interpreted only using the methods, usually specified by its class. The relationship between similar objects is preserved (inheritance), as are references between objects.
Object-oriented Databases Doesn’t translate well into SQL data: Object-SQL Impedance Mismatch “As an industry, ODBMS were long considered to be a lost opportunity to revolutionize software development. Since 2004, object databases have seen a renaissance when open source object databases appeared…”
Markup Languages Markup refers to the use of a markup language to describe the structure and appearance of a particular document. –HTML: describes the appearance of documents –XML: geared to the description of the structure of documents –There are many types of documents, so many derivatives from XML exist
Markup, cont’d Used for both documents and records Both XML and HTML derived from SGML, “Standard Generalized Markup Language” (origins in the 1960s; ISO standard 1986). A language for formulating languages –XML (1996): a simplified subset of SGML –HTML (1992): very simplified subset; XHTML conforms to XML
Markup, cont’d A Tale of Two Cities SERIAL & HIERARCHICAL: Stephen's Web (validation)
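The two readings of a marked-up document, serial and hierarchical, can be sketched with Python's standard xml.etree.ElementTree module. The element names book, title, chapter and paragraph are assumptions made for this example, not the markup shown on the original slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for the opening of the novel.
DOCUMENT = """<book>
  <title>A Tale of Two Cities</title>
  <chapter number="1">
    <paragraph>It was the best of times, it was the worst of times.</paragraph>
  </chapter>
</book>"""

root = ET.fromstring(DOCUMENT)

# Serial view: elements in the order they appear in the text.
serial_order = [element.tag for element in root.iter()]


def outline(element, depth=0):
    """Hierarchical view: each element indented under its parent."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print(serial_order)
print("\n".join(outline(root)))
```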
XML OpenDocument: for office documents DocBook: for manuals XrML: for enforceable copyright statements RSS: for news/posting syndication MathML: for formatting mathematical formulations RuleML: expressing formal rules for processing information, etc.
DTD/Schema, Document, XSLT
XML, cont’d Repetition of elements within repetitions. XML databases –Relational/hybrid –“Native” –XQuery
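"Repetition within repetitions" can be illustrated with a small nested document, along with one way it might be flattened into table-like rows. This sketch assumes Python's xml.etree.ElementTree; the library, book and author elements and the sample values are invented.

```python
import xml.etree.ElementTree as ET

# Books repeat inside the library, and authors repeat inside each book.
CATALOGUE = """<library>
  <book title="Book A">
    <author>Author 1</author>
    <author>Author 2</author>
  </book>
  <book title="Book B">
    <author>Author 3</author>
  </book>
</library>"""


def flatten(xml_text):
    """Turn nested book/author elements into (title, author) rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):
        for author in book.findall("author"):
            # The book title is repeated once per author.
            rows.append((book.get("title"), author.text))
    return rows


rows = flatten(CATALOGUE)
print(rows)
```

The nested form states each title once, while the flat rows repeat it for every author. A relational design would normally split such rows into separate tables to avoid the repetition.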
Summary Three forms of organizing information Each is flexible and powerful, but only within specific domains/purposes Most widespread database technologies are relational But the other two forms (markup and object-oriented) do not translate easily into this format.