1
2003.09.30 - SLIDE 1 IS 202 – FALL 2003 Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2003 http://www.sims.berkeley.edu/academics/courses/is202/f03/ SIMS 202: Information Organization and Retrieval Lecture 11: Intro to Database Design
2
2003.09.30 - SLIDE 2IS 202 – FALL 2003 Lecture Overview Review –MediaStreams Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings
3
2003.09.30 - SLIDE 3IS 202 – FALL 2003 Lecture Overview Review –MediaStreams Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings
4
2003.09.30 - SLIDE 4IS 202 – FALL 2003 Streams vs. Clips
5
2003.09.30 - SLIDE 5IS 202 – FALL 2003 Stream-Based Representation Makes annotation pay off –The richer the annotation, the more numerous the possible segmentations of the video stream Clips –Change from being fixed segmentations of the video stream, to being the results of retrieval queries based on annotations of the video stream Annotations –Create representations which make clips, not representations of clips
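The idea that clips are the results of retrieval queries over annotations, rather than fixed segments, can be sketched in a few lines. This is only an illustration and not how Media Streams is implemented: the `Annotation` class, the `clips_for` function, and the time values are hypothetical, and annotations are assumed to be simple descriptor labels with start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    descriptor: str  # e.g. "dog", "biting", "Steve"
    start: float     # seconds from the start of the stream (assumed unit)
    end: float


def clips_for(annotations, *descriptors):
    """Return sorted (start, end) spans where all requested descriptors overlap."""
    spans = [(0.0, float("inf"))]  # start with the whole stream
    for d in descriptors:
        matching = [(a.start, a.end) for a in annotations if a.descriptor == d]
        # Intersect every span found so far with every span of this descriptor.
        spans = [
            (max(s1, s2), min(e1, e2))
            for (s1, e1) in spans
            for (s2, e2) in matching
            if max(s1, s2) < min(e1, e2)
        ]
    return sorted(spans)


# Hypothetical annotations on one video stream.
stream = [
    Annotation("dog", 0.0, 12.0),
    Annotation("biting", 4.0, 6.5),
    Annotation("Steve", 3.0, 9.0),
    Annotation("dog", 20.0, 25.0),
]

dog_clips = clips_for(stream, "dog")
dog_biting_steve = clips_for(stream, "dog", "biting", "Steve")
```

Adding one more annotation to `stream` changes which clips can be retrieved without re-cutting the video, which is the sense in which richer annotation yields more possible segmentations.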
6
2003.09.30 - SLIDE 6IS 202 – FALL 2003 Keywords vs. Semantic Descriptors dog, biting, Steve
7
2003.09.30 - SLIDE 7IS 202 – FALL 2003 Why Keywords Don’t Work Are not a semantic representation Do not describe relations between descriptors Do not describe temporal structure Do not converge Do not scale
8
2003.09.30 - SLIDE 8IS 202 – FALL 2003 Natural Language vs. Visual Language Jack, an adult male police officer, while walking to the left, starts waving with his left arm, and then has a puzzled look on his face as he turns his head to the right; he then drops his facial expression and stops turning his head, immediately looks up, and then stops looking up after he stops waving but before he stops walking.
9
2003.09.30 - SLIDE 9IS 202 – FALL 2003 After Capture: Media Streams
10
2003.09.30 - SLIDE 10 IS 202 – FALL 2003 Media Streams Features
Key features
–Stream-based representation (better segmentation)
–Semantic indexing (what things are similar to)
–Relational indexing (who is doing what to whom)
–Temporal indexing (when things happen)
–Iconic interface (designed visual language)
–Universal annotation (standardized markup schema)
Key benefits
–More accurate annotation and retrieval
–Global usability and standardization
–Reuse of rich media according to content and structure
11
2003.09.30 - SLIDE 11IS 202 – FALL 2003 Video Retrieval In Media Streams Same interface for annotation and retrieval Assembles responses to queries as well as finds them Query responses use semantics to degrade gracefully
12
2003.09.30 - SLIDE 12IS 202 – FALL 2003 Lecture Overview Review –MediaStreams Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings
13
2003.09.30 - SLIDE 13IS 202 – FALL 2003 What is a Database?
14
2003.09.30 - SLIDE 14IS 202 – FALL 2003 Files and Databases File: A collection of records or documents dealing with one organization, person, area or subject (Rowley) –Manual (paper) files –Computer files Database: A collection of similar records with relationships between the records (Rowley) –Bibliographic, statistical, business data, images, etc.
15
2003.09.30 - SLIDE 15IS 202 – FALL 2003 Database A Database is a collection of stored operational data used by the application systems of some particular enterprise (C.J. Date) –Paper “Databases” Still contain a large portion of the world’s knowledge –File-Based Data Processing Systems Early batch processing of (primarily) business data –Database Management Systems (DBMS)
16
2003.09.30 - SLIDE 16IS 202 – FALL 2003 Why DBMS? History –50’s and 60’s all applications were custom built for particular needs –File based –Many similar/duplicative applications dealing with collections of business data –Early DBMS were extensions of programming languages –1970 - E.F. Codd and the Relational Model –1979 - Ashton-Tate and first Microcomputer DBMS
17
2003.09.30 - SLIDE 17 IS 202 – FALL 2003 File Based Systems [Diagram labels: Naughty, Nice, Just what asked for, Coal, Estimation, Delivery List, Application, File, Toys, Addresses, Toys]
18
2003.09.30 - SLIDE 18IS 202 – FALL 2003 From File Systems to DBMS Problems with file processing systems –Inconsistent data –Inflexibility –Limited data sharing –Poor enforcement of standards –Excessive program maintenance
19
2003.09.30 - SLIDE 19IS 202 – FALL 2003 DBMS Benefits Minimal data redundancy Consistency of data Integration of data Sharing of data Ease of application development Uniform security, privacy, and integrity controls Data accessibility and responsiveness Data independence Reduced program maintenance
20
2003.09.30 - SLIDE 20 IS 202 – FALL 2003 Terms and Concepts
Data independence
–Physical representation and location of data and the use of that data are separated
The application doesn’t need to know how or where the database has stored the data, but just how to ask for it
Moving a database from one DBMS to another should not have a material effect on application programs
Recoding, adding fields, etc. in the database should not affect applications
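A small sketch of data independence, using Python's built-in `sqlite3` module as a stand-in DBMS. The `employee` table, its columns, the sample rows, and the `employee_names` function are all hypothetical. The point is only that the application asks for data by name, so adding a field or an index in the database does not require changing the application code.

```python
import sqlite3


def employee_names(conn):
    """Application code: asks for data by name, not by storage layout."""
    rows = conn.execute("SELECT last_name FROM employee ORDER BY last_name")
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, last_name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("111-11-1111", "Baker"), ("222-22-2222", "Adams")],  # made-up rows
)
before = employee_names(conn)

# Change the database side: add a field and an index.
conn.execute("ALTER TABLE employee ADD COLUMN birthdate TEXT")
conn.execute("CREATE INDEX idx_employee_last ON employee (last_name)")

# The unchanged application function still returns the same answer.
after = employee_names(conn)
```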
21
2003.09.30 - SLIDE 21IS 202 – FALL 2003 Database Environment CASE Tools DBMS User Interface Application Programs Repository Database
22
2003.09.30 - SLIDE 22 IS 202 – FALL 2003 Database Components
DBMS
–Design tools: Table Creation, Form Creation, Query Creation, Report Creation, Procedural language compiler (4GL)
–Run time: Form processor, Query processor, Report Writer, Language Run time
User Interface Applications
Application Programs
Database
–Database contains: User’s Data, Metadata, Indexes, Application Metadata
23
2003.09.30 - SLIDE 23IS 202 – FALL 2003 Types of Database Systems PC databases Centralized database Client/server databases Distributed databases Database models
24
2003.09.30 - SLIDE 24IS 202 – FALL 2003 PC Databases E.g.: Access FoxPro Dbase Etc.
25
2003.09.30 - SLIDE 25IS 202 – FALL 2003 Centralized Databases Central Computer
26
2003.09.30 - SLIDE 26IS 202 – FALL 2003 Client Server Databases Network Client Database Server
27
2003.09.30 - SLIDE 27IS 202 – FALL 2003 Distributed Databases computer Location A Location C Location B Homogeneous Databases
28
2003.09.30 - SLIDE 28IS 202 – FALL 2003 Distributed Databases Local Network Database Server Client Comm Server Remote Comp. Remote Comp. Heterogeneous Or Federated Databases
29
2003.09.30 - SLIDE 29 IS 202 – FALL 2003 Terms and Concepts
A “database application” is an application program (or set of related programs) that is used to perform a series of database activities on behalf of database users:
–Create: Add new data to the database
–Read: Read current data from the database
–Update: Update or modify current database data
–Delete: Remove current data from the database
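The four activities map directly onto SQL statements. The following is a minimal sketch using Python's built-in `sqlite3` module. The `toy` table and its columns are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy (id INTEGER PRIMARY KEY, name TEXT, stock INTEGER)")

# Create: add new data to the database.
conn.execute("INSERT INTO toy (name, stock) VALUES (?, ?)", ("train", 5))

# Read: read current data from the database.
stock_after_create = conn.execute(
    "SELECT stock FROM toy WHERE name = ?", ("train",)
).fetchone()[0]

# Update: modify current database data.
conn.execute("UPDATE toy SET stock = stock - 1 WHERE name = ?", ("train",))
stock_after_update = conn.execute(
    "SELECT stock FROM toy WHERE name = ?", ("train",)
).fetchone()[0]

# Delete: remove current data from the database.
conn.execute("DELETE FROM toy WHERE name = ?", ("train",))
remaining = conn.execute("SELECT COUNT(*) FROM toy").fetchone()[0]
```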
30
2003.09.30 - SLIDE 30IS 202 – FALL 2003 Terms and Concepts Enterprise –Organization Entity –Person, Place, Thing, Event, Concept... Attributes –Data elements (facts) about some entity –Also sometimes called fields or items or domains Data values –Instances of a particular attribute for a particular entity
31
2003.09.30 - SLIDE 31IS 202 – FALL 2003 Terms and Concepts Key –An attribute or set of attributes used to identify or locate records in a file Primary Key –An attribute or set of attributes that uniquely identifies each record in a file
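The difference between a key used to locate records and a primary key that uniquely identifies them can be sketched with `sqlite3`. The table, the index name, and the sample values are hypothetical. Here `last_name` is assumed to be a non-unique key supported by an index, and `ssn` is the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (ssn TEXT PRIMARY KEY, last_name TEXT NOT NULL)"
)
# A non-unique key: helps locate records, but several records may share a value.
conn.execute("CREATE INDEX idx_last_name ON employee (last_name)")

conn.execute("INSERT INTO employee VALUES ('111-11-1111', 'Smith')")
conn.execute("INSERT INTO employee VALUES ('222-22-2222', 'Smith')")
smiths = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE last_name = 'Smith'"
).fetchone()[0]

# The primary key must identify exactly one record, so a repeated SSN is refused.
try:
    conn.execute("INSERT INTO employee VALUES ('111-11-1111', 'Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```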
32
2003.09.30 - SLIDE 32IS 202 – FALL 2003 Terms and Concepts Models –(1) Levels or views of the Database Conceptual, logical, physical –(2) DBMS types Relational, Hierarchic, Network, Object-Oriented, Object-Relational
33
2003.09.30 - SLIDE 33 IS 202 – FALL 2003 Models (1) [Diagram labels: conceptual requirements from Applications 1 to 4; Conceptual Model; Logical Model; Internal Model; External Models for the applications] More later on this…
34
2003.09.30 - SLIDE 34 IS 202 – FALL 2003 Data Models(2): History Hierarchical Model (1960’s and 1970’s) –Similar to data structures in programming languages [Diagram labels: Books (id, title); Publisher; Subjects; Authors (first, last)]
35
2003.09.30 - SLIDE 35 IS 202 – FALL 2003 Data Models(2): History Network Model (1970’s) –Provides for single entries of data and navigational “links” through chains of data. [Diagram labels: Subjects; Books; Authors; Publishers]
36
2003.09.30 - SLIDE 36IS 202 – FALL 2003 Data Models(2): History Relational Model (1980’s) –Provides a conceptually simple model for data as relations (typically considered “tables”) with all data visible
37
2003.09.30 - SLIDE 37 IS 202 – FALL 2003 Data Models(2): History Object Oriented Data Model (1990’s) –Encapsulates data and operations as “Objects” [Diagram labels: Books (id, title); Publisher; Subjects; Authors (first, last)]
38
2003.09.30 - SLIDE 38IS 202 – FALL 2003 Data Models(2): History Object-Relational Model (1990’s) –Combines the well-known properties of the Relational Model with such OO features as: User-defined datatypes User-defined functions Inheritance and sub-classing All of the major enterprise DBMS systems are now Object-Relational or incorporate Object-Relational features
39
2003.09.30 - SLIDE 39IS 202 – FALL 2003 Lecture Overview Review –MediaStreams Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings
40
2003.09.30 - SLIDE 40IS 202 – FALL 2003 Database System Life Cycle Growth, Change, & Maintenance 6 Operations 5 Integration 4 Design 1 Conversion 3 Physical Creation 2
41
2003.09.30 - SLIDE 41IS 202 – FALL 2003 Design Determination of the needs of the organization Development of the Conceptual Model of the database –Typically using Entity-Relationship diagramming techniques Construction of a Data Dictionary Development of the Logical Model
42
2003.09.30 - SLIDE 42IS 202 – FALL 2003 Physical Creation Development of the Physical Model of the Database –Data formats and types –Determination of indexes, etc. Load a prototype database and test Determine and implement security, privacy and access controls Determine and implement integrity constraints
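A rough sketch of what the physical creation steps can look like in practice, again with `sqlite3` standing in for the DBMS. The `part` table, its column types, the index name, and the rule that quantities cannot be negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE part (
    part_no  INTEGER PRIMARY KEY,                    -- data formats and types
    name     TEXT    NOT NULL,
    quantity INTEGER NOT NULL CHECK (quantity >= 0)  -- integrity constraint
);
CREATE INDEX idx_part_name ON part (name);           -- index chosen for lookups by name
""")

# Load a small prototype and test that the constraint is enforced.
conn.execute("INSERT INTO part VALUES (1, 'bolt', 100)")
try:
    conn.execute("INSERT INTO part VALUES (2, 'nut', -5)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

# PRAGMA index_list returns one row per index; column 1 is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('part')")]
```

Security, privacy, and access controls are not shown, since an in-memory SQLite database has no user accounts to configure.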
43
2003.09.30 - SLIDE 43IS 202 – FALL 2003 Conversion Convert existing data sets and applications to use the new database –May need programs, conversion utilities to convert old data to new formats
44
2003.09.30 - SLIDE 44IS 202 – FALL 2003 Integration Overlaps with Phase 3 Integration of converted applications and new applications into the new database
45
2003.09.30 - SLIDE 45IS 202 – FALL 2003 Operations All applications run full-scale Privacy, security, access control must be in place Recovery and Backup procedures must be established and used
46
2003.09.30 - SLIDE 46 IS 202 – FALL 2003 Growth, Change, and Maintenance Change is a way of life –Applications, data requirements, reports, etc. will all change as new needs and requirements are found –The Database and applications will need to be modified to meet the needs of changes
47
2003.09.30 - SLIDE 47IS 202 – FALL 2003 Another View of the Life Cycle Operations 5 Conversion 3 Physical Creation 2 Growth, Change 6 Integration 4 Design 1
48
2003.09.30 - SLIDE 48IS 202 – FALL 2003 Lecture Overview Review –MediaStreams Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings
49
2003.09.30 - SLIDE 49 IS 202 – FALL 2003 Database Design Process [Diagram labels: conceptual requirements from Applications 1 to 4; Conceptual Model; Logical Model; Internal Model; External Models for the applications]
50
2003.09.30 - SLIDE 50IS 202 – FALL 2003 Entity An Entity is an object in the real world (or even imaginary worlds) about which we want or need to maintain information –Persons (e.g.: customers in a business, employees, authors) –Things (e.g.: purchase orders, meetings, parts, companies) Employee
51
2003.09.30 - SLIDE 51IS 202 – FALL 2003 Attributes Attributes are the significant properties or characteristics of an entity that help identify it and provide the information needed to interact with it or use it (this is the Metadata for the entities) Employee Last Middle First Name SSN Age Birthdate Projects
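One way to read the Employee diagram is as a record type whose fields are the attributes. The sketch below is an interpretation, not part of the lecture: it assumes Name is a composite attribute (First, Middle, Last), Projects is multi-valued, and Age is derived from Birthdate rather than stored. The sample values are made up.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Name:
    # Composite attribute: assumed to consist of three parts.
    first: str
    middle: str
    last: str


@dataclass
class Employee:
    ssn: str                                      # identifying attribute
    name: Name
    birthdate: date
    projects: list = field(default_factory=list)  # multi-valued attribute

    def age(self, today):
        """Derived attribute: computed from birthdate instead of being stored."""
        years = today.year - self.birthdate.year
        if (today.month, today.day) < (self.birthdate.month, self.birthdate.day):
            years -= 1  # birthday has not happened yet this year
        return years


# A made-up entity instance with its data values.
emp = Employee("123-45-6789", Name("Ada", "M", "Example"), date(1970, 10, 1), ["Catalog"])
age_on_lecture_day = emp.age(date(2003, 9, 30))
```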
52
2003.09.30 - SLIDE 52IS 202 – FALL 2003 Relationships Relationships are the associations between entities They can involve one or more entities and belong to particular relationship types
53
2003.09.30 - SLIDE 53IS 202 – FALL 2003 Relationships Class Attends Student Part Supplies project parts Supplier Project
54
2003.09.30 - SLIDE 54 IS 202 – FALL 2003 Types of Relationships Concerned only with cardinality of relationship [Diagram, Chen ER notation: Employee Assigned Truck (1 and 1); Employee Assigned Project (n and 1); Employee Assigned Project (m and n)]
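Looking ahead to relational tables, the one-to-one and one-to-many cases can be sketched with foreign keys in `sqlite3`. This is one common encoding and not the only one. The table and column names are hypothetical, and the example assumes that many employees share one project and that each employee has at most one truck. The many-to-many case needs a separate table and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE project (proj_no INTEGER PRIMARY KEY);
CREATE TABLE employee (
    ssn     TEXT PRIMARY KEY,
    proj_no INTEGER REFERENCES project (proj_no)   -- one-to-many: many employees, one project
);
CREATE TABLE truck (
    truck_no INTEGER PRIMARY KEY,
    ssn      TEXT UNIQUE REFERENCES employee (ssn) -- one-to-one: UNIQUE allows one truck per employee
);
""")

conn.execute("INSERT INTO project VALUES (1)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("111-11-1111", 1), ("222-22-2222", 1)],
)
employees_on_project = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE proj_no = 1"
).fetchone()[0]

conn.execute("INSERT INTO truck VALUES (10, '111-11-1111')")
try:
    # A second truck for the same employee violates the one-to-one rule.
    conn.execute("INSERT INTO truck VALUES (11, '111-11-1111')")
    second_truck_rejected = False
except sqlite3.IntegrityError:
    second_truck_rejected = True
```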
55
2003.09.30 - SLIDE 55 IS 202 – FALL 2003 Other Notations [Diagram, “Crow’s Foot” notation: Employee Assigned Truck; Employee Assigned Project; Employee Assigned Project]
56
2003.09.30 - SLIDE 56 IS 202 – FALL 2003 Other Notations [Diagram, IDEF1X notation: Employee Assigned Truck; Employee Assigned Project; Employee Assigned Project]
57
2003.09.30 - SLIDE 57 IS 202 – FALL 2003 More Complex Relationships [Diagram labels: Evaluation among Project, Employee, and Manager, with cardinalities 1/n/n, 1/1/1, n/n/1; Project Assigned Employee, with 4(2-10) and 1, and attributes SSN, Project, Date; Employee Manages / Is Managed By, with 1 and n]
58
2003.09.30 - SLIDE 58 IS 202 – FALL 2003 Weak Entities Owe existence entirely to another entity [Diagram: Order Contains Order-line; attribute labels: Invoice#, Part#, Rep#, Quantity, Invoice#]
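In a relational schema, a weak entity is often represented by a table whose key includes the key of its owner, with deletion cascading from the owner. The sketch below uses `sqlite3`. Which attributes belong to which entity is an assumption here (Invoice# and Rep# on the order; Invoice#, Part#, and Quantity on the order line), and the table names and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for REFERENCES and CASCADE to be enforced
conn.executescript("""
CREATE TABLE orders (invoice_no INTEGER PRIMARY KEY, rep_no INTEGER);
CREATE TABLE order_line (
    invoice_no INTEGER NOT NULL REFERENCES orders (invoice_no) ON DELETE CASCADE,
    part_no    INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (invoice_no, part_no)  -- identified only together with its order
);
""")

conn.execute("INSERT INTO orders VALUES (1001, 7)")
conn.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)", [(1001, 1, 3), (1001, 2, 1)]
)
lines_before = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]

# An order line cannot exist without its order.
try:
    conn.execute("INSERT INTO order_line VALUES (9999, 1, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the order removes its lines as well.
conn.execute("DELETE FROM orders WHERE invoice_no = 1001")
lines_after = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]
```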
59
2003.09.30 - SLIDE 59IS 202 – FALL 2003 Supertype and Subtype Entities Clerk Is one of Sales-rep Invoice Other Employee Sold Manages
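Relational tables have no built-in notion of subtypes, but one common workaround is a table per subtype that shares the supertype's primary key. The sketch below shows that pattern in `sqlite3`. It is only one possible encoding, and the `region` and `desk` columns, like the sample rows, are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee  (ssn TEXT PRIMARY KEY, name TEXT NOT NULL);          -- supertype
CREATE TABLE sales_rep (ssn TEXT PRIMARY KEY REFERENCES employee (ssn),
                        region TEXT);                                       -- subtype
CREATE TABLE clerk     (ssn TEXT PRIMARY KEY REFERENCES employee (ssn),
                        desk TEXT);                                         -- subtype
""")

conn.execute("INSERT INTO employee VALUES ('111-11-1111', 'Pat')")
conn.execute("INSERT INTO sales_rep VALUES ('111-11-1111', 'West')")

# A sales rep is an employee: shared attributes come from the supertype table.
row = conn.execute("""
    SELECT e.name, s.region
    FROM employee AS e JOIN sales_rep AS s ON s.ssn = e.ssn
""").fetchone()

# A subtype row without a matching supertype row is refused.
try:
    conn.execute("INSERT INTO clerk VALUES ('999-99-9999', 'Front')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```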
60
2003.09.30 - SLIDE 60 IS 202 – FALL 2003 Many to Many Relationships [Diagram: Employee (SSN) Is Assigned Project Assignment (SSN, Proj#, Hours); Project (Proj#) Assigned Project Assignment]
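The Project Assignment entity resolves the many-to-many relationship into two one-to-many relationships. A minimal `sqlite3` sketch follows, using the attribute names from the slide (SSN, Proj#, Hours) rewritten as column names. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY);
CREATE TABLE project  (proj_no INTEGER PRIMARY KEY);
CREATE TABLE project_assignment (
    ssn     TEXT    NOT NULL REFERENCES employee (ssn),
    proj_no INTEGER NOT NULL REFERENCES project (proj_no),
    hours   REAL    NOT NULL,
    PRIMARY KEY (ssn, proj_no)  -- one row per employee and project pair
);
""")

conn.executemany("INSERT INTO employee VALUES (?)", [("111-11-1111",), ("222-22-2222",)])
conn.executemany("INSERT INTO project VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO project_assignment VALUES (?, ?, ?)",
    [("111-11-1111", 1, 10), ("111-11-1111", 2, 5), ("222-22-2222", 1, 7)],
)

# An employee can be on many projects, and a project can have many employees.
hours_by_project = conn.execute(
    "SELECT proj_no, SUM(hours) FROM project_assignment "
    "GROUP BY proj_no ORDER BY proj_no"
).fetchall()
projects_for_first = conn.execute(
    "SELECT COUNT(*) FROM project_assignment WHERE ssn = '111-11-1111'"
).fetchone()[0]
```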
61
2003.09.30 - SLIDE 61IS 202 – FALL 2003 Lecture Overview Review –MediaStreams Databases and Database Design Database Life Cycle ER Diagrams Discussion Next Time/Readings
62
2003.09.30 - SLIDE 62IS 202 – FALL 2003 Questions: Brooke Maury Discussion Questions on Hoffer & McFadden: The relational database model has remained fairly static since its inception in the 1970’s. Is this evidence of its strength as an organizational model or an indication of its inflexibility?
63
2003.09.30 - SLIDE 63IS 202 – FALL 2003 Questions: Brooke Maury If the goal of the relational database model is to encode a ‘conceptual’ design into a logical design, is it possible that improved technology and the development of new modeling techniques will supplant the RDBMS? Specifically, what impact will XML and the development of document engineering have on organizing information in multiple normalized tables? Conversely, what does the relational model have that would be lost if a conceptual design was encoded in another model?
64
2003.09.30 - SLIDE 64 IS 202 – FALL 2003 Questions: Brooke Maury (Next time?) The drive to develop the RDBMS was in part motivated by a need to minimize the space required and improve the performance of database systems by removing redundancies. What impact will very inexpensive data storage and computing power have on the relational database model and the third normal form especially?
65
2003.09.30 - SLIDE 65 IS 202 – FALL 2003 Questions: Shane Ahern Discussion Questions for "Logical Database Design and the Relational Model" Is the normalization process described really necessary? When I design a database schema, I find that by thinking of tables in terms of the entities they represent (employees, sales, events), I avoid most of the problems that the normalization process seeks to address (e.g. salesperson and region in a Sales table, where salesperson is clearly a distinct entity from sales). If the formal process described in the article is not followed, are there potential pitfalls that might lead to problems with your database schema?
66
2003.09.30 - SLIDE 66 IS 202 – FALL 2003 Questions: Shane Ahern The article points out that "the relational model does not yet directly support supertype/subtype relationships." Once the tables in a relational database have been decomposed to third normal form, the database is efficient from a systems point of view, but the tables no longer give a representation of the data that is intuitive to humans. The object-oriented model more accurately mirrors the way we think about the concepts that we wish to store in databases. So perhaps object-oriented database systems are worth considering. What about XML databases?
67
2003.09.30 - SLIDE 67IS 202 – FALL 2003 Questions: Arthur Law The three models that we have been presented with, Entity Relationship Model, NIAM Model, and Object Oriented Model all enforce a specific thought process in the organization and relationship between items in a database. With all of our recent discussion of computers understanding natural language are these methods now out of date with how we should be organizing information? Should we use artificial intelligence or learning algorithms to statistically determine the relationship between entities or is there still value in using these models?
68
2003.09.30 - SLIDE 68 IS 202 – FALL 2003 Questions: Arthur Law Each model is approximately one decade apart in development and a quick Google search shows that companies are using databases with one of the three models. However, as new models arise there doesn't seem to be much interest in migrating from one data model to another, which makes sense given that an organization using a given model probably finds that it works. Now with the proliferation of XML, we see more information being shared between organizations, so are we fated for an expensive and lengthy translation process between databases? Or should all DB administrators be responsible for upgrading to the latest model?
69
2003.09.30 - SLIDE 69IS 202 – FALL 2003 Lecture Overview Review –MediaStreams Databases and Database Design Database Life Cycle ER Diagrams Database Design Discussion Next Time/Readings
70
2003.09.30 - SLIDE 70IS 202 – FALL 2003 Next Time Database Design – Normalization and SQL Readings (no additional DBMS readings) Additional Questions/ or revisit some of today’s discussion questions in the light of the next lecture?