LIS508 lecture 2 Thomas Krichel 2003-10-07. today's lecture Recap on what we did last week. Encoding mark-up Databases.


1 LIS508 lecture 2 Thomas Krichel 2003-10-07

2 today's lecture Recap on what we did last week. Encoding mark-up Databases

3 Recap Computers deal with on/off signals called bits. Collections of these bits are binary numbers. Texts are (basically) strings of characters. To represent text, we need to represent characters. To make a character understandable to a computer, we associate a number with each character. The result is a character set.
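
The recap above can be illustrated with a short sketch. It assumes Python and its built-in Unicode code points as the character set; the sample string is an arbitrary choice, not something from the lecture.

```python
# Illustrative sketch: characters -> numbers -> bits, and back again.
text = "LIS508"

# Each character is associated with a number (its code point in the character set).
code_points = [ord(ch) for ch in text]

# Each number can be written as a binary number, i.e. a collection of bits.
bit_strings = [format(n, "08b") for n in code_points]

# Mapping the numbers back to characters recovers the original text.
round_trip = "".join(chr(n) for n in code_points)

print(code_points)
print(bit_strings)
print(round_trip)
```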

4 Beyond characters There is more to text than a string of characters. There is layout –titles –abstracts –mathematical formula spacing

5 Layout Layout can be conveyed by additional text that has special meaning. Examples –LaTeX –HTML –PostScript Another way is to do non-textual layout by adding some other digital signals. Examples –DVI –MS Word –MS Powerpoint These cannot be shown in these slides!

6 Example: LaTeX \bigskip\textbf{Class structure} Classes will be held in the computer lab in the Palmer School between 18:15 and 20:45. An optional practice session will last until 21:15. \begin{tabular}{@{}llll@{}} 0&2003--09--23&introduction to the course &\\ 1&2002--09--30&bits bytes and characters &\\ 2&2003--10--07&databases and markup languages&\\

7 Example: HTML Class structure Classes will be held in the computer lab in the Palmer School between 18:15 and 20:45. An optional practice session will last until 21:15. Class details: 0 2003–09–23 introduction to the course 1 2002–09–30 bits bytes and characters

8 Example: PostScript Fc(Class)g(structur)o(e)-104 3956 y Fd(Classes)26 b(will)g(be)e(held)g(in)h(the)f(computer)f(lab)i(in)f(the)h(P)o(almer)f(School)g(between)f(18:15)h(and)g(20:45.)36 b(An)25 b(optional)e(practice)h(session)-104 4055 y(will)d(last)g(until)f(21:15.)-104 4155 y(Class)i(details:)-104 4307 y(0)141 b(2003\22609\22623)94 b(introduction)18 b(to)i(the)h(course)-104 4407 y(1)141 b(2002\22609\22630)94 b(bits)21 b(bytes)f(and)g(characters)-104 4507 y(2)141 b(2003\22610\22607)94 b(databases)20 b(and)g(markup)e(languages)-

9 DVI (rendition, "class structure") 1659: fntnum27 current font is ptmb8t 1660: setchar67 h:=-820459+473168=-347291, hh:=-22 1661: setchar108 h:=-347291+182183=-165108, hh:=-10 1662: setchar97 h:=-165108+327680=162572, hh:=11 1663: setchar115 h:=162572+254928=417500, hh:=27 1664: setchar115 h:=417500+254928=672428, hh:=43 1665: right3 163840 h:=672428+163840=836268, hh:=53 1669: setchar115 h:=836268+254928=1091196, hh:=69 1670: setchar116 h:=1091196+218232=1309428, hh:=83 1671: setchar114 h:=1309428+290976=1600404, hh:=101 1672: setchar117 h:=1600404+364376=1964780, hh:=124 1673: setchar99 h:=1964780+290976=2255756, hh:=142 1674: setchar116 h:=2255756+218232=2473988, hh:=156 1675: setchar117 h:=2473988+364376=2838364, hh:=179 1676: setchar114 h:=2838364+290976=3129340, hh:=197 1677: right2 -11792 h:=3129340-11792=3117548, hh:=196 1680: setchar101 h:=3117548+290976=3408524, hh:=214

10 Databases Databases are collections of data with some organization to them. The classic example is the relational database. But not all databases need to be relational databases.

11 Relational databases A relational database is a set of tables. There may be relations between the tables. Each table has a number of records. Each record has a number of fields. When the database is being set up, we fix –the size of each field –relationships between tables

12 Example: Movie database
ID | title               | director          | date
M1 | Gone with the wind  | F. Ford Coppola   | 1963
M2 | Room with a view    | Coppola, F Ford   | 1985
M3 | High Noon           | Woody Allan       | 1974
M4 | Star Wars           | Steve Spielberg   | 1993
M5 | Alien               | Allen, Woody      | 1987
M6 | Blowing in the Wind | Spielberg, Steven | 1962
Single table. No relations between tables, of course.

13 Problem with this database All the data are wrong, but this is just for illustration. Names are recorded inconsistently. There is no way to find films by Woody Allen without going through all the spelling variations. Mistakes are difficult to correct. We have to wade through all records, a masochist's pleasure.
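
The problem can be made concrete with a small sketch. It assumes the single-table data from the movie database slide held as a plain Python list; the function name `films_by` is hypothetical and only serves the illustration.

```python
# Single-table data as shown on the slide (deliberately wrong, per the lecture).
movies = [
    ("M1", "Gone with the wind", "F. Ford Coppola", 1963),
    ("M2", "Room with a view", "Coppola, F Ford", 1985),
    ("M3", "High Noon", "Woody Allan", 1974),
    ("M4", "Star Wars", "Steve Spielberg", 1993),
    ("M5", "Alien", "Allen, Woody", 1987),
    ("M6", "Blowing in the Wind", "Spielberg, Steven", 1962),
]

def films_by(director):
    """Return titles whose director field matches the given string exactly."""
    return [title for _, title, name, _ in movies if name == director]

# An exact match on one spelling misses the record stored under another spelling.
exact = films_by("Woody Allan")

# Only by listing every known variant by hand do we find both films.
variants = {"Woody Allan", "Allen, Woody"}
all_variants = [title for _, title, name, _ in movies if name in variants]

print(exact)
print(all_variants)
```

Every new spelling variant would have to be added to `variants` by hand, which is the maintenance burden the slide describes.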

14 Better movie database
ID | title               | director | year
M1 | Gone with the wind  | D1       | 1963
M2 | Room with a view    | D1       | 1985
M3 | High Noon           | D2       | 1974
M4 | Star Wars           | D3       | 1993
M5 | Alien               | D2       | 1987
M6 | Blowing in the Wind | D3       | 1962

ID | director name         | birth year
D1 | Ford Coppola, Francis | 1942
D2 | Allan, Woody          | 1957
D3 | Spielberg, Steven     | 1942

15 Relational database We have a one-to-many relationship between directors and films –Each film has one director –Each director may have directed many films Here it becomes possible for the computer –To know which films have been directed by Woody Allen –To find which films have been directed by a director born in 1942
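
A minimal sketch of these two lookups, assuming Python's built-in `sqlite3` module and an in-memory database. The table and column names (`directors`, `movies`, `director_id`, `birth_year`) and the helper function names are illustrative choices, not taken from the lecture; the rows are the deliberately wrong sample data from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directors (id TEXT PRIMARY KEY, name TEXT, birth_year INTEGER);
CREATE TABLE movies (
    id TEXT PRIMARY KEY,
    title TEXT,
    director_id TEXT REFERENCES directors(id),  -- one director per film
    year INTEGER
);
""")
conn.executemany("INSERT INTO directors VALUES (?, ?, ?)", [
    ("D1", "Ford Coppola, Francis", 1942),
    ("D2", "Allan, Woody", 1957),
    ("D3", "Spielberg, Steven", 1942),
])
conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?)", [
    ("M1", "Gone with the wind", "D1", 1963),
    ("M2", "Room with a view", "D1", 1985),
    ("M3", "High Noon", "D2", 1974),
    ("M4", "Star Wars", "D3", 1993),
    ("M5", "Alien", "D2", 1987),
    ("M6", "Blowing in the Wind", "D3", 1962),
])

def films_by_director(director_id):
    """Titles of all films that point at the given director record."""
    rows = conn.execute(
        "SELECT title FROM movies WHERE director_id = ? ORDER BY id",
        (director_id,),
    )
    return [title for (title,) in rows]

def films_by_birth_year(birth_year):
    """Titles of films whose director was born in the given year."""
    rows = conn.execute(
        """SELECT m.title FROM movies m
           JOIN directors d ON d.id = m.director_id
           WHERE d.birth_year = ? ORDER BY m.id""",
        (birth_year,),
    )
    return [title for (title,) in rows]

print(films_by_director("D2"))
print(films_by_birth_year(1942))
```

Because each director's name is stored once, correcting a spelling means changing a single row in `directors`.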

16 Many-to-many relationships Each film has one director, but many actors star in it. The relationship between actors and films is a many-to-many relationship. Here are a few actors:
ID | sex | actor name      | birth year
A1 | f   | Brigitte Bardot | 1972
A2 | m   | George Clooney  | 1927
A3 | f   | Marilyn Monroe  | 1934

17 Actor/Movie table
actor id | movie id
A1       | M4
A2       | M3
A3       | M2
A1       | M5
A1       | M3
A2       | M6
A3       | M4
… as many lines as required
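
The actor/movie table can be sketched as a linking table that is read in both directions. This assumes Python's built-in `sqlite3` module; the names `actors`, `actor_movie`, `movies_of` and `cast_of` are hypothetical, and the rows are the sample data from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actors (id TEXT PRIMARY KEY, sex TEXT, name TEXT, birth_year INTEGER);
-- One row per (actor, movie) pair; this table carries the many-to-many relationship.
CREATE TABLE actor_movie (
    actor_id TEXT REFERENCES actors(id),
    movie_id TEXT,
    PRIMARY KEY (actor_id, movie_id)
);
""")
conn.executemany("INSERT INTO actors VALUES (?, ?, ?, ?)", [
    ("A1", "f", "Brigitte Bardot", 1972),
    ("A2", "m", "George Clooney", 1927),
    ("A3", "f", "Marilyn Monroe", 1934),
])
conn.executemany("INSERT INTO actor_movie VALUES (?, ?)", [
    ("A1", "M4"), ("A2", "M3"), ("A3", "M2"), ("A1", "M5"),
    ("A1", "M3"), ("A2", "M6"), ("A3", "M4"),
])

def movies_of(actor_id):
    """Movie ids in which the given actor appears (one actor, many movies)."""
    rows = conn.execute(
        "SELECT movie_id FROM actor_movie WHERE actor_id = ? ORDER BY movie_id",
        (actor_id,),
    )
    return [movie_id for (movie_id,) in rows]

def cast_of(movie_id):
    """Names of the actors appearing in the given movie (one movie, many actors)."""
    rows = conn.execute(
        """SELECT a.name FROM actors a
           JOIN actor_movie am ON am.actor_id = a.id
           WHERE am.movie_id = ? ORDER BY a.id""",
        (movie_id,),
    )
    return [name for (name,) in rows]

print(movies_of("A1"))
print(cast_of("M4"))
```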

18 SQL Once we have the relational database, we can ask sophisticated questions: –Which director has had the most female actors working for him? –In which years were films shot that starred actors born between 1926 and 1935? Such questions can be encoded in a language known as Structured Query Language or SQL. All relational database vendors implement a dialect of SQL.
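
As a sketch of how the two questions might be written, the following uses the SQLite dialect through Python's built-in `sqlite3` module. The schema names are illustrative assumptions, the rows are the sample data from the slides, and "most female actors" is interpreted here as the number of distinct female actors, which is one possible reading of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directors (id TEXT PRIMARY KEY, name TEXT, birth_year INTEGER);
CREATE TABLE movies (id TEXT PRIMARY KEY, title TEXT, director_id TEXT, year INTEGER);
CREATE TABLE actors (id TEXT PRIMARY KEY, sex TEXT, name TEXT, birth_year INTEGER);
CREATE TABLE actor_movie (actor_id TEXT, movie_id TEXT);
""")
conn.executemany("INSERT INTO directors VALUES (?, ?, ?)", [
    ("D1", "Ford Coppola, Francis", 1942),
    ("D2", "Allan, Woody", 1957),
    ("D3", "Spielberg, Steven", 1942),
])
conn.executemany("INSERT INTO movies VALUES (?, ?, ?, ?)", [
    ("M1", "Gone with the wind", "D1", 1963),
    ("M2", "Room with a view", "D1", 1985),
    ("M3", "High Noon", "D2", 1974),
    ("M4", "Star Wars", "D3", 1993),
    ("M5", "Alien", "D2", 1987),
    ("M6", "Blowing in the Wind", "D3", 1962),
])
conn.executemany("INSERT INTO actors VALUES (?, ?, ?, ?)", [
    ("A1", "f", "Brigitte Bardot", 1972),
    ("A2", "m", "George Clooney", 1927),
    ("A3", "f", "Marilyn Monroe", 1934),
])
conn.executemany("INSERT INTO actor_movie VALUES (?, ?)", [
    ("A1", "M4"), ("A2", "M3"), ("A3", "M2"), ("A1", "M5"),
    ("A1", "M3"), ("A2", "M6"), ("A3", "M4"),
])

# Question 1: which director has worked with the most (distinct) female actors?
top_director = conn.execute("""
    SELECT d.name, COUNT(DISTINCT a.id) AS n
    FROM directors d
    JOIN movies m ON m.director_id = d.id
    JOIN actor_movie am ON am.movie_id = m.id
    JOIN actors a ON a.id = am.actor_id
    WHERE a.sex = 'f'
    GROUP BY d.id, d.name
    ORDER BY n DESC
    LIMIT 1
""").fetchone()

# Question 2: in which years were films shot that starred actors born 1926-1935?
years = [year for (year,) in conn.execute("""
    SELECT DISTINCT m.year
    FROM movies m
    JOIN actor_movie am ON am.movie_id = m.id
    JOIN actors a ON a.id = am.actor_id
    WHERE a.birth_year BETWEEN 1926 AND 1935
    ORDER BY m.year
""")]

print(top_director)
print(years)
```

Both questions are answered by joining the tables on their ID fields, which is only possible because the relationships were fixed when the database was set up.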

19 Databases in libraries Relational databases dominate the world of structured data. But they are not so popular in libraries –Slow on very large databases (such as catalogs) –Library data has nasty ad-hoc relationships, e.g. the translation of the first edition of a book, or a CD supplement that comes with the print version. These are difficult to deal with in a system where all relations and fields have to be set up at the start and cannot easily be changed later.

20 http://openlib.org/home/krichel Thank you for your attention!

