The Information School of the University of Washington INFO-340: Database Management & Information Retrieval David Hendry Class L-02
Topics
  Information Systems
  Database systems: Short History
  Three-level ANSI-SPARC Architecture
  Functions of a DBMS
Q & A
  Syllabus
  Assignment #1
Information Systems
Information Systems
  Examples
    – Airline reservation system
    – ATM network
    – File system on a PC
    – CD collection at home
    – Museum or art gallery
    – Website
    – File sharing system
    – A personal stamp collection or family scrapbook
An Information System
  The resources that enable the collection, management, control, and dissemination of information throughout an organization
Components of an Information System
  Stakeholders
    – Management
    – Division workers
    – Customers
    – Partners
  Inputs & Outputs
    – Traffic
    – Sales data
    – Plans
    – Calendars & events
    – Part assemblies
    – Business transactions
  Procedures
    – Updating data
    – Transferring data
Components of Systems
  [Figure labels: Supplier, Customer, System, Environment, Input, Output, Process, Stakeholder]
  [Figure labels: System, Sub-system, Boundary]
Three Key Ideas
  Systems are hierarchical
    – Systems consist of sub-systems
  Systems are nearly decomposable
    – Interaction between sub-systems is weak
  System boundaries are arbitrary
    – Where you set a boundary requires judgment
Class Exercise: Museum as Information System
  What questions should you answer?
Museum as Information System
  Who are the stakeholders?
  What is the environment?
  What are the inputs, processes & outputs?
  Where are the system boundaries?
  How does the system decompose hierarchically?
  Where does the strict decomposition fail?
  Where are the feedback loops?
Components of Systems
  Environment: Where the system operates
  System: Interacting components that work together to complete a function
  Subsystem: A system is made up of other systems (HIERARCHICAL)
  Boundary: What is inside and outside the system
  Inputs & Outputs: Material flowing into and out of a system
  Process: What gets done?
Development Lifecycle
Development Lifecycle
  Phases
    – Define: Vision/scope, needs assessment
    – Design: Invent the technological solution
    – Develop: Build the technology
    – Deploy: Deliver stable technology
  Deliverables
    – Vision/scope document
    – Design specifications document
    – Beta software
    – Version release
Database Development
  1. Analysis of functional requirements
  2. Conceptual design
  3. Logical design
  4. Physical design
  5. Implement
  6. Test
  7. Maintain
Database Systems
Evolution of Database Systems
  File-based systems (1950s – now)
    Application programs process files
  1st Generation (mid 1960s – mid 1980s)
    Hierarchical & Network databases
  2nd Generation (mid 1970s – now)
    Relational database systems
  3rd Generation (early 1990s – now)
    Object-oriented database systems
File Systems
  Application programs manage their own data files and produce reports
  Collections of programs were often organized by functional area (payroll vs. personnel)
File-Based Data Processing
  [Figure labels: Payroll System, Personal Data, Tax Data, Projects Data, Project Management System, Personal Data, S1, S2]
Weaknesses
  Program-data dependence
  Separation and isolation of data
  Duplication of data
  Incompatibility of files
  Many, many application programs
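Program-data dependence can be illustrated with a small sketch. The fixed-width record layout, the field widths, and the function name below are hypothetical and chosen only for illustration: the payroll program hard-codes where the salary sits in each record, so a layout change made elsewhere silently corrupts what it reads.

```python
# Hypothetical fixed-width layout: columns 0-9 hold the name,
# columns 10-15 hold the salary.
def payroll_read_salary(record: str) -> int:
    """Payroll program: the file layout is hard-coded as slice offsets."""
    return int(record[10:16])


old_record = "Smith".ljust(10) + "052000"
salary_before = payroll_read_salary(old_record)

# Another program's owner widens the name field to 12 characters.
# The payroll code is unchanged, so it now reads the wrong columns.
new_record = "Smith".ljust(12) + "052000"
salary_after = payroll_read_salary(new_record)

print(salary_before)  # 52000
print(salary_after)   # 520: no error is raised, the value is simply wrong
```

When the data definition lives in each program, every program that touches the file has to be found and changed together.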
Key Lessons Learned
  1. Program-data independence is good
    – Programs should not be responsible for the definition of data formats
  2. Centralized control of data access is good
    – Programs should not be responsible for security, access control, and certain kinds of data integrity
1st Generation: Record-Based DBMS
  To address these problems, two types of databases were developed in the 1960s and early 1970s
    – Network data models
    – Hierarchical data models
Hierarchical/Network Data Model
  [Figure labels: Courses, Students]
  Collections of ‘records’
  Pointers used to create ‘sets’
Lessons Learned
  Better on
    – Data independence
    – Sharing data
  However, complex application programming
    – Chasing ‘pointers’ to navigate data
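A rough sketch of what ‘chasing pointers’ means for the application programmer. This is plain Python, not a real hierarchical or network DBMS, and the class names, field names, and sample data are all hypothetical: the program must know how records are linked and walk those links itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Student:
    name: str


@dataclass
class Course:
    title: str
    students: List[Student] = field(default_factory=list)  # the 'set' of member records
    next_course: Optional["Course"] = None                 # pointer to the next course record


def courses_taken_by(first_course: Optional[Course], student_name: str) -> List[str]:
    """Walk the chain of course records, then each course's member set."""
    found = []
    course = first_course
    while course is not None:            # follow the 'next' pointer
        for student in course.students:  # follow the set-membership pointers
            if student.name == student_name:
                found.append(course.title)
        course = course.next_course
    return found


ann, bob = Student("Ann"), Student("Bob")
databases = Course("Databases", [ann, bob])
networks = Course("Networks", [ann])
databases.next_course = networks

print(courses_taken_by(databases, "Ann"))  # ['Databases', 'Networks']
```

The traversal logic is tied to the link structure, so a change in how records are chained means rewriting the program.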
2nd Generation: Relational Model
  Data modeled as tables, rows, columns
  No pointer chasing
  Grounded in theory (relational algebra)
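A minimal sketch of the relational style, using Python's built-in sqlite3 module as a stand-in relational DBMS. The table names, column names, and sample rows are assumptions made for illustration. The query states which rows are wanted and relates tables by matching values; the program follows no stored links.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course     (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course_id INTEGER);
    INSERT INTO student    VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO course     VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (2, 10), (1, 20);
""")

# Declarative query: tables are joined on matching values.
rows = conn.execute("""
    SELECT course.title
    FROM course
    JOIN enrollment ON enrollment.course_id = course.id
    JOIN student    ON student.id = enrollment.student_id
    WHERE student.name = ?
    ORDER BY course.title
""", ("Ann",)).fetchall()

titles = [title for (title,) in rows]
print(titles)  # ['Databases', 'Networks']
```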
3rd Generation: Object-Oriented Database Management Systems
  Domain objects (entities, relationships, etc.) modeled directly rather than with tables, rows, columns
  Very important in engineering domains
Three-level ANSI-SPARC Architecture
External Level
  Different users require different data views
    – Specific information for goals, job roles, etc.
  Some information is derived/calculated
    – Dynamic calculations (age)
    – Complex combinations of data
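One common way to provide an external view in a relational DBMS is an SQL view. The sketch below uses Python's sqlite3 module; the `staff` table, the `staff_directory` view, and the sample rows are hypothetical. The view hides the salary column and derives an age at query time (approximated here as the difference between the current year and the birth year).

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (
        id INTEGER PRIMARY KEY, name TEXT, birth_year INTEGER, salary INTEGER
    );
    INSERT INTO staff VALUES (1, 'Ann', 1980, 52000), (2, 'Bob', 1990, 48000);

    -- View for a directory user: no salary, age calculated on each query.
    CREATE VIEW staff_directory AS
        SELECT name,
               CAST(strftime('%Y', 'now') AS INTEGER) - birth_year AS age
        FROM staff;
""")

cur = conn.execute("SELECT * FROM staff_directory ORDER BY name")
columns = [description[0] for description in cur.description]
rows = cur.fetchall()

print(columns)  # ['name', 'age']
print(rows)     # ages depend on the year in which the query runs
```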
Conceptual Level
  What data is stored and the relationships between the data
  Key concerns:
    – Entities, attributes, relationships
    – Data types
    – Constraints
    – Security and integrity info
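In a relational system these concerns are typically written down as table definitions. The sketch below, using Python's sqlite3 module, is one possible rendering. The `department` and `staff` entities, their attributes, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (               -- entity
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE           -- attribute with type and constraints
    );
    CREATE TABLE staff (                    -- entity
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        salary        INTEGER CHECK (salary >= 0),
        department_id INTEGER NOT NULL
                      REFERENCES department(id)  -- relationship: staff works in a department
    );
    INSERT INTO department VALUES (1, 'Collections');
    INSERT INTO staff VALUES (1, 'Ann', 52000, 1);
""")

# The schema, not the application, rejects a row that breaks the relationship.
try:
    conn.execute("INSERT INTO staff VALUES (2, 'Bob', 48000, 99)")  # no department 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

staff_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
print(rejected, staff_count)  # True 1
```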
Internal Level
  How the data is stored
    – Optimal run-time performance
    – Optimal space utilization
  Key concerns:
    – Storage space for data and indices
    – Record size and placement
    – Data compression and encryption
Schemas: Contain information for mapping from one level to the next
Data Independence
  Logical data independence
    Changes in the conceptual schema do not cause the external schemas to ‘break’ (if they fail, they fail gracefully)
  Physical data independence
    Changes to the internal schema do not cause the conceptual schema to ‘break’
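Both kinds of independence can be sketched with Python's sqlite3 module. The table, view, index, and column names are hypothetical. An index stands in for a change to the internal schema, and an added column stands in for a change to the conceptual schema; the same external query returns the same answer after each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
    INSERT INTO staff VALUES (1, 'Ann', 52000), (2, 'Bob', 48000);
    CREATE VIEW staff_names AS SELECT id, name FROM staff;  -- external schema
""")

external_query = "SELECT name FROM staff_names ORDER BY name"
before = conn.execute(external_query).fetchall()

# Physical data independence: change how the data is stored and accessed.
conn.execute("CREATE INDEX staff_by_name ON staff(name)")
after_index = conn.execute(external_query).fetchall()

# Logical data independence: add an attribute to the conceptual schema.
conn.execute("ALTER TABLE staff ADD COLUMN office TEXT")
after_column = conn.execute(external_query).fetchall()

print(before == after_index == after_column)  # True
```

A counter-example of data dependence would be an application that runs `SELECT *` against `staff` and relies on the number or order of columns it gets back.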
Class Exercise
  1. Working in teams of 3-4, select an example database application and sketch a picture of:
    – External schema
    – Conceptual schema
    – Internal schema
  2. Give an example of data independence and data dependence
Functions of a DBMS (See Chapter #2)
Functions of a DBMS
  1. Data storage, retrieval, and update
  2. A user-accessible catalog
  3. Transaction support
  4. Concurrency control
  5. Recovery services
  6. Authorization services
  7. Support for data communication
  8. Integrity services
Summary
  Evolution of Database Systems
    – File-based
    – 1st – 3rd generation systems
  Three-level ANSI-SPARC Architecture
  Functions of a DBMS
Data storage, retrieval, and update
  Ability to store, retrieve and update data
  Key idea: Hide internal representation of how this is achieved
A user-accessible catalog
  Provide users with a catalog that completely describes the database
    – Tables and relationships
    – Names, types and sizes of data items
    – Etc.
  Purposes:
    – “Self revealing” for understanding data
    – Data integrity and security are enforced
    – Store auditing information
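As a concrete, SQLite-specific illustration: SQLite exposes its catalog through the `sqlite_master` table and the `PRAGMA table_info` command, both queried here from Python's sqlite3 module. Other systems expose their catalogs differently, and the `staff` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary INTEGER)"
)

# Which tables exist? Ask the catalog rather than the application code.
tables = [
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Which data items does a table have, and of what type?
# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(staff)")]

print(tables)   # ['staff']
print(columns)  # [('id', 'INTEGER'), ('name', 'TEXT'), ('salary', 'INTEGER')]
```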
Transaction support
  A transaction is a series of actions
    – Example: Staff member quits
      1. Delete staff member from database
      2. Re-assign responsibilities to another staff member
  Issue: Must avoid putting the database into an inconsistent state
  Thus: All steps of a transaction are completed or none are completed
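The staff-member-quits example can be sketched with Python's sqlite3 module. The table layout, the function name, and the simulated failure are hypothetical. Using the connection as a context manager commits when the block finishes and rolls back when it raises, so either both steps happen or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE responsibility (task TEXT, staff_id INTEGER NOT NULL);
    INSERT INTO staff VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO responsibility VALUES ('Front desk', 1);
""")


def staff_member_quits(conn, leaver_id, successor_id):
    """Run both steps as one transaction."""
    with conn:  # commit on success, roll back if an exception escapes
        conn.execute("DELETE FROM staff WHERE id = ?", (leaver_id,))
        if successor_id is None:
            # Simulated failure between step 1 and step 2.
            raise ValueError("no successor to take over responsibilities")
        conn.execute(
            "UPDATE responsibility SET staff_id = ? WHERE staff_id = ?",
            (successor_id, leaver_id),
        )


try:
    staff_member_quits(conn, leaver_id=1, successor_id=None)
except ValueError:
    pass
staff_after_failure = conn.execute("SELECT name FROM staff ORDER BY id").fetchall()
print(staff_after_failure)  # [('Ann',), ('Bob',)]  the delete was rolled back

staff_member_quits(conn, leaver_id=1, successor_id=2)
staff_after_success = conn.execute("SELECT name FROM staff ORDER BY id").fetchall()
tasks_after_success = conn.execute("SELECT task, staff_id FROM responsibility").fetchall()
print(staff_after_success, tasks_after_success)  # [('Bob',)] [('Front desk', 2)]
```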
Concurrency control
  Ensuring that multiple users do not conflict with each other and put the database into an inconsistent state
  Easy for read-only situations
  Hard when multiple users can read and write
  See lost-update problem
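The lost-update problem can be reproduced in a few lines. This sketch replays one bad interleaving by hand on a single sqlite3 connection instead of running real concurrent users, and the `account` table and the amounts are hypothetical. Both users read the same starting value, so the second write overwrites the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")


def read_balance():
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


def write_balance(value):
    conn.execute("UPDATE account SET balance = ? WHERE id = 1", (value,))


# Bad interleaving: both users read before either one writes.
seen_by_a = read_balance()     # user A reads 100
seen_by_b = read_balance()     # user B reads 100
write_balance(seen_by_a + 10)  # A deposits 10 and writes 110
write_balance(seen_by_b - 20)  # B withdraws 20 and writes 80, erasing A's deposit
lost_update_result = read_balance()

# Same two operations, each done as one read-modify-write step in the DBMS.
write_balance(100)
conn.execute("UPDATE account SET balance = balance + 10 WHERE id = 1")
conn.execute("UPDATE account SET balance = balance - 20 WHERE id = 1")
serial_result = read_balance()

print(lost_update_result, serial_result)  # 80 90
```

A DBMS with concurrency control prevents the first outcome, typically by locking or by detecting the conflicting write.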
Recovery services
  Databases ‘crash’
    – Power goes out
    – Disks and CPUs fail
    – Intruders cause systems to fail
    – Etc.
  Provide a method for recovering the database and returning it to a consistent state
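A very rough sketch of the goal, using Python's sqlite3 module with a temporary database file. Closing the connection without committing stands in for a crash here, which is a simplification of real failure modes. The file name, table, and rows are hypothetical. After reopening, committed work is present and the unfinished transaction has left no trace.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE log (entry TEXT)")
conn.execute("INSERT INTO log VALUES ('committed work')")
conn.commit()

conn.execute("INSERT INTO log VALUES ('work in flight')")
conn.close()  # simulated crash: the second insert was never committed

# Reopen the database, as a restart after the failure would.
conn = sqlite3.connect(path)
entries = [entry for (entry,) in conn.execute("SELECT entry FROM log")]
conn.close()

print(entries)  # ['committed work']
```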
Authorization services
  Depending on job role, users have access to different information and operations
    – Querying data
    – Changing data
    – Deleting data
    – Adding data
  Must be able to give ‘access permissions’ to people
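Multi-user SQL systems usually grant permissions with `GRANT` and `REVOKE` statements. SQLite has no user accounts, so the sketch below approximates a read-only job role with the `set_authorizer` hook of Python's sqlite3 module instead. The role, the table, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO staff VALUES (1, 'Ann'), (2, 'Bob');
""")

# Hypothetical role: may query data, but not add, change, or delete it.
WRITE_ACTIONS = {sqlite3.SQLITE_INSERT, sqlite3.SQLITE_UPDATE, sqlite3.SQLITE_DELETE}


def read_only_role(action, arg1, arg2, db_name, source):
    """Called by SQLite for each action a statement wants to perform."""
    return sqlite3.SQLITE_DENY if action in WRITE_ACTIONS else sqlite3.SQLITE_OK


conn.set_authorizer(read_only_role)

names = [name for (name,) in conn.execute("SELECT name FROM staff ORDER BY id")]

try:
    conn.execute("DELETE FROM staff WHERE id = 1")
    delete_allowed = True
except sqlite3.Error:
    delete_allowed = False

remaining = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
print(names, delete_allowed, remaining)  # ['Ann', 'Bob'] False 2
```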
Support for data communication
  Ability to access central databases from remote client locations
    – This idea, of course, ‘powers the web’
  Databases must handle requests and responses
Integrity services
  Rules that specify the valid states of the data within the database
  Examples
    – Every employee must have a manager
    – Managers supervise a max of 10 employees
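The two example rules can be written as constraints that the DBMS enforces. This is one possible sketch using Python's sqlite3 module; the table names, the trigger name, and the sample data are hypothetical. A `NOT NULL` foreign key covers the first rule and a trigger covers the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE manager (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        -- Rule 1: every employee must have a manager.
        manager_id INTEGER NOT NULL REFERENCES manager(id)
    );
    -- Rule 2: managers supervise a max of 10 employees.
    CREATE TRIGGER max_ten_reports BEFORE INSERT ON employee
    WHEN (SELECT COUNT(*) FROM employee WHERE manager_id = NEW.manager_id) >= 10
    BEGIN
        SELECT RAISE(ABORT, 'a manager supervises at most 10 employees');
    END;
    INSERT INTO manager VALUES (1, 'Ann');
""")


def try_insert(name, manager_id):
    """Return True if the DBMS accepts the row, False if a rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO employee (name, manager_id) VALUES (?, ?)", (name, manager_id)
        )
        return True
    except sqlite3.DatabaseError:
        return False


results = [try_insert(f"employee {i}", 1) for i in range(11)]
no_manager = try_insert("orphan", None)

print(results.count(True), results[-1], no_manager)  # 10 False False
```

The trigger only guards inserts; a fuller version would also check updates that move an employee to a different manager.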
4GLs
  High-level applications that are ‘closer’ to users’ goals
  Example types (e.g., Access):
    – Form generators
    – Report generators
    – Graphics generators
    – Application generators