IST 210: ORGANIZATION OF DATA Chapter 1. Getting Started IST210 1
Purpose of a Database
The purpose of a database is to keep track of things.
Unlike a list or spreadsheet, a database may store information that is more complicated than a simple list.
Mini Case
You are designing our course selection system.
What attributes do you need to store in a record?
  Student ID, Student Name, Student's Department, CourseID, Instructor, CourseName, Location
What questions (i.e., queries) will users ask?
  Student: Which classes have I registered for this semester?
  Instructor: How many students are registered, and what are their backgrounds?
What tool would you use to manage the data? Excel?
Problems with a Simple List
Redundancy
Multiple themes
Problems with a Simple List: Redundancy
In a list, each row is intended to stand on its own. As a result, the same information may be entered several times.
A class enrollment list may include Student ID, Student Name, Class, Instructor Name, Location, Lecture Time, …
If there are 40 students taking IST210, the class information will be entered 40 times.
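The redundancy described on this slide can be made concrete with a small sketch. The 40 student names and the room are hypothetical sample values; only the course (IST210) and the count of 40 come from the slide.

```python
# Hypothetical flat enrollment list: one row per (student, class) pair.
# Student names and the room are made-up sample values.
students = [f"Student{i}" for i in range(1, 41)]
flat_list = [
    {"student_id": i, "student_name": name, "class": "IST210",
     "instructor": "Jessie", "location": "Room 101"}
    for i, name in enumerate(students, start=1)
]

# The class facts (class, instructor, location) are identical on every row.
class_facts = {(r["class"], r["instructor"], r["location"]) for r in flat_list}

print(len(flat_list))    # 40 rows in the list
print(len(class_facts))  # 1 distinct class fact, stored 40 times
```

Every row repeats the same class facts, which is exactly what makes later updates error-prone.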
Problems with a Simple List: Multiple Themes
In a list, each row may contain information on more than one theme. As a result, needed information may appear in the list only if information on other themes is also present.
For example: a class registration list may include Student Information (ID, Name, Department) and Course Information (ID, Instructor, Location).
List Modification Issues
Redundancy and multiple themes create modification problems:
  Deletion problems
  Update problems
  Insertion problems
List Modification Issues: Insert
Insert: A new student not taking any class
Problem: Blank cells for course information
List Modification Issues: Update
Update: IST210 location changed
Problem: Need to update multiple rows
List Modification Issues: Delete
Delete: Kate drops 230
Problem: Information about Kate and about course 230 will be lost!
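The three modification problems can be reproduced on a tiny flat list. This is a minimal sketch with hypothetical rows: the departments, rooms, the instructor "Mary", and the new student "Tom" are all made up for illustration, and the actual table on the slide is not reproduced here.

```python
# Hypothetical flat list; each row is
# (student, department, course, instructor, location).
rows = [
    ("Bob",  "IST", 210, "Jessie", "Room 101"),
    ("Lisa", "IST", 210, "Jessie", "Room 101"),
    ("Bob",  "IST", 220, "John",   "Room 202"),
    ("Kate", "CS",  230, "Mary",   "Room 303"),
]

# Insert: a new student with no class forces blank course cells.
with_new = rows + [("Tom", "IST", None, None, None)]

# Update: moving course 210 means touching every row that mentions it.
rows_to_update = [r for r in rows if r[2] == 210]
updated = [r[:4] + ("Room 999",) if r[2] == 210 else r for r in rows]

# Delete: when Kate drops 230, the only row about Kate and about 230 is removed.
after_drop = [r for r in rows if not (r[0] == "Kate" and r[2] == 230)]
kate_known = any(r[0] == "Kate" for r in after_drop)
course_230_known = any(r[2] == 230 for r in after_drop)

print(len(rows_to_update))           # 2 rows must change for one fact
print(kate_known, course_230_known)  # False False
```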
A Long List to Several Small Lists
Two themes: Student, Course
INFORMATION LOSS! Registration information is not in the Student and Course tables.
A Long List to Several Small Lists
Two themes: Student, Course
PROBLEMS! One cell does NOT allow multiple values. (IMPORTANT! This rule is strictly enforced in a database.)
A Long List to Several Small Lists
Three themes: two entities and one relationship
  Student Entity
  Course Entity
  Student-Course Relationship
A Long List to Several Small Lists
Tables: Student, Course, Registration
Key points in splitting:
1. A table must be connected with other table(s) through shared column(s):
   Student and Registration share StudentID.
   Course and Registration share CourseID.
2. One cell can only have one value.
Revisit previous issues:
   Insert: A new student not taking any class
   Update: IST210 location changed
   Delete: Kate drops 230
Use the above criteria to check whether you split the tables correctly!
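One way to check the split against the earlier insert, update, and delete issues is sketched below. The tables are modeled as plain Python dictionaries and a set rather than a real database, and all sample values (departments, rooms, "Mary", "Tom", the StudentIDs for Bob and Kate) are hypothetical.

```python
# Three small tables; all sample values are hypothetical.
student = {1: {"name": "Bob", "dept": "IST"},
           4: {"name": "Kate", "dept": "CS"}}
course = {210: {"instructor": "Jessie", "location": "Room 101"},
          230: {"instructor": "Mary", "location": "Room 303"}}
registration = {(1, 210), (4, 230)}  # (StudentID, CourseID) pairs

# Key point 1: Registration connects to Student and Course via shared columns.
links_ok = all(sid in student and cid in course for sid, cid in registration)

# Insert: a new student with no class needs no blank course cells.
student[6] = {"name": "Tom", "dept": "IST"}

# Update: the location of 210 is stored once, so only one cell changes.
course[210]["location"] = "Room 999"

# Delete: Kate drops 230; only the registration row goes away.
registration.discard((4, 230))
kate_known = 4 in student
course_230_known = 230 in course

print(links_ok, kate_known, course_230_known)  # True True True
```

Each registration is a single (StudentID, CourseID) pair, so no cell ever holds more than one value.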
In-Class Exercise
What are the problems with this table?
Split it into multiple tables.
Check whether you split the table correctly.
Relational Databases
A relational database stores information in tables. Each informational topic is stored in its own table.
In essence, a relational database will break up a list into several parts, one part for each theme in the list.
A well-formed relational database: a criterion to determine whether a database is good enough (no redundancy, no modification issues).
We will learn this in Chapter 2.
Answer Query: Putting the Pieces Back Together
In our relational database, we broke apart our list into several tables. Somehow the tables must be joined back together.
In a relational database, to answer a query, tables are joined together using the value of the data.
Query Relational Database: Using One Table
Tables: Student, Course, Registration
Query 1: How many students take class 210?
Answer: Check the Registration Table to see how many rows have CourseID 210. count = 4
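The one-table lookup can be sketched in a few lines of Python. The Registration rows are hypothetical except for the four students in course 210 (StudentIDs 1, 5, 2, 3), which are listed elsewhere in this chapter.

```python
# Hypothetical Registration rows as (StudentID, CourseID) pairs.
# Only the four rows for course 210 come from the slides; the rest are made up.
registration = [(1, 210), (5, 210), (2, 210), (3, 210),
                (1, 220), (3, 220), (4, 230)]

# Query 1: count the rows whose CourseID is 210.
count_210 = sum(1 for _student_id, course_id in registration if course_id == 210)

print(count_210)  # 4
```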
Query Relational Database: Using Two Tables
Tables: Student, Course, Registration
Query 2: How many students take the class taught by John?
Answer:
Step 1. Check the CourseID taught by John in the Course Table. CourseID = 220
Step 2. See how many students are taking the class with CourseID 220 in the Registration Table. count = 2
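The two steps above can be sketched as follows. The data is hypothetical: which two students take course 220, and the instructor "Mary" for course 230, are assumptions chosen only so that the result matches the count on the slide.

```python
# Hypothetical Course table: CourseID -> instructor ("Mary" is made up).
course = {210: "Jessie", 220: "John", 230: "Mary"}

# Hypothetical Registration rows as (StudentID, CourseID) pairs.
registration = [(1, 210), (5, 210), (2, 210), (3, 210),
                (1, 220), (3, 220), (4, 230)]

# Step 1: find the CourseIDs taught by John in the Course table.
john_courses = {cid for cid, instructor in course.items() if instructor == "John"}

# Step 2: count the Registration rows whose CourseID is one of those.
count_john = sum(1 for _sid, cid in registration if cid in john_courses)

print(john_courses)  # {220}
print(count_john)    # 2
```

The value 220 found in step 1 is what links the two tables: it appears in the CourseID column of both.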
Query Relational Database: Using Three Tables
Tables: Student, Course, Registration
Query 3: Who are the students taking the class taught by Jessie?
Answer:
Step 1. Check the CourseID taught by Jessie in the Course Table. CourseID = 210
Step 2. Get the StudentIDs taking the class with CourseID 210 in the Registration Table. StudentID 1, 5, 2, 3
Step 3. Get the student names in the Student Table with StudentID 1, 5, 2, 3. Bob, Lisa, Sarah, Jim
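The three steps can be traced in Python as a rough sketch. The pairing of each StudentID with a name is an assumption (the slide lists the IDs and names in the same order, so they are paired in that order here), and Kate's StudentID, "Mary", and the rows for courses 220 and 230 are hypothetical.

```python
# Assumed Student table: StudentID -> name (ID-to-name pairing is a guess).
student = {1: "Bob", 2: "Sarah", 3: "Jim", 4: "Kate", 5: "Lisa"}

# Hypothetical Course table: CourseID -> instructor ("Mary" is made up).
course = {210: "Jessie", 220: "John", 230: "Mary"}

# Hypothetical Registration rows as (StudentID, CourseID) pairs.
registration = [(1, 210), (5, 210), (2, 210), (3, 210),
                (1, 220), (3, 220), (4, 230)]

# Step 1: CourseIDs taught by Jessie (Course table).
jessie_courses = {cid for cid, instructor in course.items() if instructor == "Jessie"}

# Step 2: StudentIDs registered for those courses (Registration table).
student_ids = [sid for sid, cid in registration if cid in jessie_courses]

# Step 3: look up each StudentID in the Student table.
names = [student[sid] for sid in student_ids]

print(student_ids)  # [1, 5, 2, 3]
print(names)        # ['Bob', 'Lisa', 'Sarah', 'Jim']
```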
Query Relational Database
Tables: Student, Course, Registration
In a relational database, to answer a query, tables are joined together using the value of the data in the shared columns.
Query Relational Database: Structured Query Language (SQL)
Structured Query Language (SQL) is an international standard for creating, processing, and querying databases and their tables.
Query 2: How many students take the class taught by John?
  SELECT COUNT(StudentID)
  FROM Course, Registration
  WHERE Course.CourseID = Registration.CourseID
    AND Course.Instructor = 'John'
Answer:
Step 1. Check the CourseID taught by John in the Course Table. CourseID = 220
Step 2. See how many students are taking the class with CourseID 220 in the Registration Table. count = 2
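To try the query, one option is Python's built-in sqlite3 module with an in-memory database. This is only a sketch: the table contents are hypothetical (chosen so that John's course 220 has two registered students, as in the two-table walkthrough), and the column types are assumptions.

```python
import sqlite3

# In-memory database with hypothetical Course and Registration rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Course (CourseID INTEGER PRIMARY KEY, Instructor TEXT);
CREATE TABLE Registration (StudentID INTEGER, CourseID INTEGER);
""")
conn.executemany("INSERT INTO Course VALUES (?, ?)",
                 [(210, "Jessie"), (220, "John"), (230, "Mary")])
conn.executemany("INSERT INTO Registration VALUES (?, ?)",
                 [(1, 210), (5, 210), (2, 210), (3, 210),
                  (1, 220), (3, 220), (4, 230)])

# Query 2 from the slide: join on the shared CourseID column, filter on John.
query = """
SELECT COUNT(StudentID)
FROM Course, Registration
WHERE Course.CourseID = Registration.CourseID
  AND Course.Instructor = 'John'
"""
(count_john,) = conn.execute(query).fetchone()

print(count_john)  # 2 with the sample rows above
conn.close()
```

The WHERE clause does in one statement what the two manual steps do: it matches rows through the shared CourseID column and keeps only John's course.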
Exercise (cont.)
After you split this table into multiple tables, answer the following questions:
Question 1: How many items were purchased by Anderson?
Question 2: Which items were purchased by customers in State College?
Sounds like More Work, Not Less
A relational database is more complicated than a list.
However, a relational database minimizes data redundancy, preserves complex relationships among topics, and allows for partial data.
Furthermore, a relational database provides a solid foundation for user forms and reports.
Example in Textbook: For your reference…
Relational Database Example
A Relational Database Solves the Problems of Lists
The Department, Advisor and Student Tables
The Project Equipment Tables
Key Points in This Chapter
What is the problem with using a simple list to store information?
  Redundancy, modification issues
What is the solution to replace a simple list?
  A relational database
  Break a long list into several tables; each table has its own theme
How do we query a relational database?
  Join the tables back together by the value of the data in the shared columns
Next
I know splitting a simple list into multiple tables will reduce redundancy and avoid modification issues, but:
  How should we split a simple list?
  Is there any rule we could follow to split the list?
  Are there any criteria to know the tables are good enough?
We will answer these questions in Chapter 2.
Questions?
Reminder: No labs this week!
Fill out the programming skill survey on Angel!