
1 CSCI N201: Programming Concepts Copyright ©2005  Department of Computer & Information Science Introducing Databases

2 CSCI N201: Programming Concepts Copyright ©2004 Department of Computer & Information Science Goals
By the end of this unit, you should understand …
– … what a database is.
– … what components make up a database.
– … what a Database Management System is.
– … the differences among the types of database structures.
– … generally, how database administrators construct databases.

3 So, what is a database?
In a general sense, a database is any organized collection of data.
Examples:
–Grocery List
–Audio CD Catalog
–Phone Book
–Airline Ticketing Software
–Tax Preparation Software
–Oncourse
–Google
–MapQuest
–Amazon
–eBay

4 Databases in the Digital World
When we think of applications we commonly use, we often think of word processors as tools for projects that require us to write; we think of spreadsheets as tools for problems dealing with numbers (statistics, averages, etc.).
Whereas spreadsheets are good at answering questions involving numbers ("What is the average … ?"), databases are good at answering other types of questions ("Are there any compact discs available by … ?").

5 Databases in the Digital World (continued)
Word processors process text.
Spreadsheets process number data.
Databases process data.
(from geekgirl's plain-english computing)

6 Data vs. Information
For the user of a database, the end goal is to view meaningful information.
Raw data, the values we store in a database, are essentially useless by themselves. For instance, do we know what the value 85215 means? Is it a zip code? Is it a student ID number? Is it a code for a billing application? We don't know … (Hernandez)

7 Data Processing
When we process data, we connect sets of data to make meaningful information.
For instance, if we connect the value 85215 to the value "Tax Preparation – 1040 (Schedule C)", we're probably able to discern that the value 85215 is a code that represents some type of billable service: tax preparation, in this case (Hernandez).
The end result of data processing is meaningful information.
Data is stored; information is retrieved.
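
The idea of connecting a raw value to its context can be sketched in a few lines of Python. This is only an illustration: the `billing_codes` lookup and the `describe` function are hypothetical names, and the single entry is the example value from the slide.

```python
# Raw data: a bare value with no context attached.
raw_value = 85215

# Hypothetical lookup that supplies the context (assumption: a billing-code list).
billing_codes = {85215: "Tax Preparation – 1040 (Schedule C)"}


def describe(code):
    """Connect a raw code to its description to produce information."""
    service = billing_codes.get(code)
    if service is None:
        return f"{code}: meaning unknown"
    return f"Billing code {code}: {service}"


info = describe(raw_value)
print(info)
```

Without the lookup, 85215 stays raw data; with it, the same value becomes information a user can read.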

8 Types of Modern Databases
Operational Databases
–Used for online transaction processing (OLTP)
–Dynamic in nature ("just in time" information)
–Used heavily by commercial entities
Analytical Databases
–Used for online analytical processing (OLAP)
–Static in nature
–Often populated with data from OLTP databases
–Used heavily by research entities
(from Hernandez)

9 Historical Database Models
A database model speaks to how we create a database.
Throughout the years, people have used these models for creating databases:
–The Hierarchical Model
–The Network Model
–The Relational Model (most commonly used today)
–The Object-Oriented Model (the future?)

10 The Hierarchical Model
The hierarchical model connects tables of data via parent/child relationships. In such relationships, a parent table can have 1 or more children, but a child table must have 1 and only 1 parent.
Tables connect using the physical arrangement of records.
The hierarchical model requires that a user know the structure of the database. Access always starts at the root table.
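
A rough way to picture "access always starts at the root table" is a tree search. The sketch below is an assumption-laden illustration, not a real hierarchical DBMS: the table names in `hierarchy` and the `find_path` helper are hypothetical.

```python
# Hypothetical hierarchy: every child table has exactly one parent table.
hierarchy = {
    "Agents": {
        "Clients": {"Payments": {}, "Engagements": {}},
        "Entertainers": {"Schedule": {}},
    }
}


def find_path(tree, target, path=()):
    """Search downward from the root and return the chain of tables to target."""
    for name, children in tree.items():
        current = path + (name,)
        if name == target:
            return current
        found = find_path(children, target, current)
        if found:
            return found
    return None


path_to_payments = find_path(hierarchy, "Payments")
print(path_to_payments)
```

To reach Payments, the search has to pass through Agents and Clients, which is why a user of this model needs to know the structure in advance.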

11 Hierarchical Model Example
(Figure 1.1 from Hernandez)

12 Network Database Model
Introduces nodes and set structures. Nodes are collections of records, and set structures are the relationships in the database.
The relationship between nodes has 1 node as the owner node, with 1 or more member nodes. A record in a member node can be related to only 1 record in an owner node. Records in a member node cannot exist without being related to a record in an owner node.

13 Network Model Example
(Figure 1.3 from Hernandez)
Nodes: Agents, Clients, Entertainers, Payments, Engagements, Musical Styles
Set structures: Represent, Manage, Make, Schedule, Perform, Play

14 Relational Model
Derived from two branches of mathematics: set theory and first-order predicate logic.
Stores data in relations (tables). Each table is composed of tuples (records) and attributes (fields).
Two features of this model allow us to access data without knowing the database structure:
–The physical structure of the records and fields in a table doesn’t matter.
–We identify each individual record in a table by a unique value.

15 Table Relationships
We categorize table relationships in the Relational Model as follows:
–One-to-One (1:1)
–One-to-Many (1:N)
–Many-to-Many (N:N)
To establish a relationship between tables, we need to match values of a shared field.

16 Relationship Example
Agent ID | Agent First Name | Agent Last Name | Hire Date
100 | Mike | Hernandez | 05/16/95
101 | Greg | Piercy | 10/15/95
102 | Katherine | Ehrlich | 03/01/96

Client ID | Agent ID | Client First Name | Client Last Name
9001 | 100 | Stewart | Jameson
9002 | 101 | Shannon | McLain
9003 | 102 | Estella | Pundt
(Figure 1.5 from Hernandez)
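
Matching values of the shared Agent ID field can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch using the rows from the slide; the lowercase table and column names are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agents (
    agent_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, hire_date TEXT
);
CREATE TABLE clients (
    client_id INTEGER PRIMARY KEY, agent_id INTEGER, first_name TEXT, last_name TEXT
);
""")
conn.executemany("INSERT INTO agents VALUES (?, ?, ?, ?)", [
    (100, "Mike", "Hernandez", "05/16/95"),
    (101, "Greg", "Piercy", "10/15/95"),
    (102, "Katherine", "Ehrlich", "03/01/96"),
])
conn.executemany("INSERT INTO clients VALUES (?, ?, ?, ?)", [
    (9001, 100, "Stewart", "Jameson"),
    (9002, 101, "Shannon", "McLain"),
    (9003, 102, "Estella", "Pundt"),
])

# Relate the two tables by matching values of the shared agent_id field.
rows = conn.execute("""
    SELECT c.first_name || ' ' || c.last_name, a.first_name || ' ' || a.last_name
    FROM clients AS c
    JOIN agents AS a ON a.agent_id = c.agent_id
    ORDER BY c.client_id
""").fetchall()
for client, agent in rows:
    print(client, "is represented by", agent)
```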

17 Advantages of Relational Databases
Layers of data integrity
–Table level data integrity: ensures records aren’t duplicated and key values are present
–Relationship level data integrity: ensures that the relationship between two tables is valid
–Business level: ensures that data is accurate in terms of business rules
Data consistency & accuracy: a result of built-in data integrity
Independence from physical structure
Easy data retrieval

18 Database Management Software
Relational database management systems (RDBMS) are applications used to “create, maintain, modify and manipulate” a database.
Typically, RDBMSs include:
–Tools to build tables and establish table relationships
–Tools for creating forms for user input/output
–Tools for querying a database (asking the database a question)
–Tools for creating reports for output

19 Phases of Database Design
1. Requirements Analysis – Interviewing a business client to understand their information needs and their current (and future) business environment.
2. Data Modeling – Modeling the database structure using one of the established data-modeling methods, like entity-relationship diagrams; the end goal is to visually represent the database structure.

20 Phases of Database Design (cont.)
3. Data Normalization – Breaking large tables into smaller ones to eliminate redundant data and avoid problems when manipulating data.
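
A small sketch of the normalization step, under the assumption that we start from one wide table in which the agent's name is repeated on every client row. The `flat_rows` data and its field names are made up for illustration.

```python
# One wide, hypothetical table: the agent's name repeats for every client.
flat_rows = [
    {"client": "Stewart Jameson", "agent_id": 100, "agent_name": "Mike Hernandez"},
    {"client": "Shannon McLain", "agent_id": 100, "agent_name": "Mike Hernandez"},
    {"client": "Estella Pundt", "agent_id": 102, "agent_name": "Katherine Ehrlich"},
]

# Break it into two smaller tables: each agent is stored exactly once,
# and clients keep only the agent_id that points at the agent.
agents = {}
clients = []
for row in flat_rows:
    agents[row["agent_id"]] = row["agent_name"]
    clients.append({"client": row["client"], "agent_id": row["agent_id"]})

print(agents)
print(clients)
```

After the split, correcting an agent's name means changing one entry in `agents` rather than every client row that mentions it.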

21 Database Tables
A database stores data in relations, perceived by the user as tables.
–Composed of tuples (records) and attributes (fields)
–Chief structures in a database
–Logical and physical order of fields and records doesn’t matter
–Every table must contain a Primary Key Field, which uniquely identifies each of the table’s records.
–Tables can represent objects or events.

22 Types of Tables
Data Table
–Most common type of table in a relational database
–Stores data that supplies information
–Dynamic in nature
Validation Table (Lookup Table)
–Stores data used when enforcing data integrity
–Usually static in nature
–Examples: job codes, city names, billing categories, etc.

23 Fields
A field, or attribute, is the smallest structure in a database.
Represents a characteristic of the subject of the table to which it belongs.
The quality of information retrieved from the database depends heavily on the time invested in ensuring the structural and data integrity of fields (more on that later …).
A field should contain 1 and only 1 distinct value (FirstName or LastName versus FullName, for instance).

24 Records
A record, or tuple, is a specific instance of the subject of a table. A record is made up of all fields in a table. Some fields may not have specific values populated.
The value stored in the primary key field uniquely identifies the record throughout the database.

25 Record & Field Example
Table name: Students (columns are fields; rows are records)
Student ID | Student First Name | Student Last Name | Student Major
40853 | William | Harden | Political Science
98364 | Maria | Garcia-Grande | Nursing
15792 | Michael | Bobersky | Psychology

26 Views
A view, also called a virtual table or saved query, is made up of fields from other tables in the database. The contributing tables are called base tables.
Since data is stored in other tables, databases do not store data associated with views (thus eliminating redundancy). Databases only store the structure of the view.
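
The point that a database stores only the structure of a view can be checked with `sqlite3`. In this sketch the table names, the view name `client_agents`, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agents (agent_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE clients (client_id INTEGER PRIMARY KEY, agent_id INTEGER, last_name TEXT);
INSERT INTO agents VALUES (100, 'Hernandez'), (101, 'Piercy');
INSERT INTO clients VALUES (9001, 100, 'Jameson'), (9002, 101, 'McLain');

-- The view stores only this query, not a copy of the data.
CREATE VIEW client_agents AS
    SELECT c.last_name AS client, a.last_name AS agent
    FROM clients AS c
    JOIN agents AS a ON a.agent_id = c.agent_id;
""")

before = conn.execute("SELECT COUNT(*) FROM client_agents").fetchone()[0]

# A new row in a base table shows up in the view without touching the view.
conn.execute("INSERT INTO clients VALUES (9003, 100, 'Pundt')")
after = conn.execute("SELECT COUNT(*) FROM client_agents").fetchone()[0]
print(before, after)
```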

27 Advantages of Views
You can work with data from multiple base tables simultaneously.
Security: views prevent restricted users from manipulating data stored in base tables.
Views are useful for implementing data integrity (a validation view).

28 Primary Keys
A primary key is a field or group of fields that uniquely identifies a record. A primary key composed of two or more fields is called a composite primary key. Every table must have a primary key!
The primary key is the most important key in a table:
–Uniquely identifies a specific record throughout a database
–Identifies a specific table throughout the database
–Enforces table-level integrity
–Helps to establish relationships between tables
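
A short sketch of how a primary key enforces uniqueness, using `sqlite3` and a Students table like the one shown earlier. The column names are assumptions, and reusing ID 40853 for a second student is a deliberate mistake made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        first_name TEXT, last_name TEXT, major TEXT
    )
""")
conn.execute(
    "INSERT INTO students VALUES (40853, 'William', 'Harden', 'Political Science')"
)

# A second record with the same primary key value must be rejected.
try:
    conn.execute(
        "INSERT INTO students VALUES (40853, 'Maria', 'Garcia-Grande', 'Nursing')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print("duplicate rejected:", duplicate_rejected)
```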

29 Foreign Keys
A foreign key is important when establishing relationships between tables.
To create a foreign key, you would take a primary key from one table and incorporate it in a second table. In the second table, the key becomes a foreign key.
Foreign keys enforce relationship-level integrity: values in one table's foreign key field must match exactly with the corresponding values of a second table's primary key field.
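
The matching requirement can be sketched with `sqlite3`. Note one SQLite-specific detail: foreign key checks are off by default and are switched on per connection with `PRAGMA foreign_keys = ON`. The table names are assumptions, and client 9004 with agent 999 is a made-up row that refers to no existing agent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE agents (agent_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE clients (
    client_id INTEGER PRIMARY KEY,
    agent_id INTEGER NOT NULL REFERENCES agents(agent_id),
    last_name TEXT
);
INSERT INTO agents VALUES (100, 'Hernandez');
""")

# Matches an existing primary key value, so it is accepted.
conn.execute("INSERT INTO clients VALUES (9001, 100, 'Jameson')")

# Agent 999 does not exist, so the foreign key check rejects the row.
try:
    conn.execute("INSERT INTO clients VALUES (9004, 999, 'Nobody')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("orphan rejected:", orphan_rejected)
```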

30 Example of Primary & Foreign Keys
Agents Table
Agent ID | Agent First Name | Agent Last Name | Hire Date
100 | Mike | Hernandez | 05/16/95
101 | Greg | Piercy | 10/15/95
102 | Katherine | Ehrlich | 03/01/96

Clients Table
Client ID | Agent ID | Client First Name | Client Last Name
9001 | 100 | Stewart | Jameson
9002 | 101 | Shannon | McLain
9003 | 102 | Estella | Pundt
Agent ID is the Primary Key in the Agents Table and a Foreign Key in the Clients Table.
(Adapted from Figure 3.11 from Hernandez)

31 Relationships
We can build a relationship between tables if we can relate the records in one table with the records in the joining table.
Two methods for building a relationship:
–Linking primary and foreign keys
–Linking tables via a third table called a linking table or associative table

32 Importance of Relationships
Relationships allow users to establish views based on multiple base tables.
Relationships help to reduce data redundancy and eliminate duplicate data, thus reinforcing data integrity.

33 Categorizing Relationships
We categorize relationships between tables in three ways:
–The type of relationship between tables
–The way that each table in a relationship participates in that relationship
–The degree to which each table participates in a relationship

34 Different Types of Relationships
One-to-One Relationship (1:1)
One-to-Many Relationship (1:N)
Many-to-Many Relationship (N:N)

35 One-To-One Relationships (1:1)
A record in one table (a parent table) is related to one and only one record in a second table (a child table). A record in the second table (the child table) is related to one and only one record in the first table (the parent table).
We create a 1:1 relationship by copying the primary key of a parent table into a child table, where it becomes a foreign key.
This type of relationship is unique because both tables share the same primary key. The primary key in the child table serves both as that table's primary key and a foreign key.

36 Example of a 1:1 Relationship
Employee Table
Employee ID | Employee First Name | Employee Last Name
100 | Zachary | Erlich
101 | Susan | McClain
102 | Joe | Rosales

Compensation Table
Employee ID | Hourly Rate | Commission Rate
100 | 25.00 | 5.0%
101 | 19.75 | 3.5%
102 | 22.50 | 5.0%
Employee ID is the Primary Key for both tables and also a Foreign Key in the Compensation Table.
(Adapted from Figure 3.13 from Hernandez)

37 One-To-Many Relationships (1:N)
A record in one table (a parent table) can be related to many records in a second table (a child table). A single record in the child table is related to one and only one record in the parent table.
We create a 1:N relationship by copying the primary key of a parent table into a child table, where it becomes a foreign key.
This is the most common type of relationship in the relational database model.

38 Example of a 1:N Relationship
Agents Table
Agent ID | Agent First Name | Agent Last Name | Hire Date
100 | Mike | Hernandez | 05/16/95
101 | Greg | Piercy | 10/15/95
102 | Katherine | Ehrlich | 03/01/96

Clients Table
Client ID | Agent ID | Client First Name | Client Last Name
9001 | 100 | Stewart | Jameson
9002 | 100 | Shannon | McLain
9003 | 102 | Estella | Pundt
Agent ID is the Primary Key in the Agents Table and a Foreign Key in the Clients Table.
(Adapted from Figure 3.14 from Hernandez)
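
To see the "one agent, many clients" shape in the data above, one option is to count the client records attached to each agent. This is a sketch with `sqlite3`; the lowercase table and column names are assumptions, and only the ID columns needed for the count are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agents (agent_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE clients (client_id INTEGER PRIMARY KEY, agent_id INTEGER, last_name TEXT);
INSERT INTO agents VALUES (100, 'Hernandez'), (101, 'Piercy'), (102, 'Ehrlich');
INSERT INTO clients VALUES
    (9001, 100, 'Jameson'), (9002, 100, 'McLain'), (9003, 102, 'Pundt');
""")

# LEFT JOIN keeps agents with no clients; COUNT skips the NULLs they produce.
clients_per_agent = conn.execute("""
    SELECT a.agent_id, COUNT(c.client_id)
    FROM agents AS a
    LEFT JOIN clients AS c ON c.agent_id = a.agent_id
    GROUP BY a.agent_id
    ORDER BY a.agent_id
""").fetchall()
print(clients_per_agent)
```

Each client row carries exactly one Agent ID, while an agent can appear on zero, one, or several client rows.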

39 Many-To-Many Relationships (N:N)
A record in one table can be related to many records in a second table. A single record in the second table can be related to many records in the first table.
We cannot create an N:N relationship directly. Instead, we resolve an N:N relationship by copying the primary key of each table into a third table, called a linking (associative) table. Together, the copied keys form a composite primary key. Individually, each copied key serves as a foreign key that refers back to its original table.

40 Example of Resolving an N:N Relationship
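
A minimal sketch of a linking table, written with `sqlite3`. The `students`, `classes`, and `student_classes` tables and their rows are hypothetical and are not taken from the original figure; they only show the composite primary key made of two foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE classes (class_id INTEGER PRIMARY KEY, title TEXT);

-- Linking table: the two copied keys together form the composite primary key,
-- and each one is a foreign key back to its original table.
CREATE TABLE student_classes (
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    class_id INTEGER NOT NULL REFERENCES classes(class_id),
    PRIMARY KEY (student_id, class_id)
);

INSERT INTO students VALUES (1, 'Maria'), (2, 'Michael');
INSERT INTO classes VALUES (10, 'Databases'), (20, 'Statistics');
INSERT INTO student_classes VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many classes.
classes_for_maria = [row[0] for row in conn.execute("""
    SELECT c.title FROM student_classes AS sc
    JOIN classes AS c ON c.class_id = sc.class_id
    WHERE sc.student_id = 1 ORDER BY c.class_id
""")]

# One class, many students.
students_in_databases = [row[0] for row in conn.execute("""
    SELECT s.name FROM student_classes AS sc
    JOIN students AS s ON s.student_id = sc.student_id
    WHERE sc.class_id = 10 ORDER BY s.student_id
""")]
print(classes_for_maria, students_in_databases)
```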

41 Relationship Participation
There are two ways that we categorize relationships based on participation:
–Mandatory Participation: If a user MUST enter at least one record into a first table before s/he may enter records in a second, related table.
–Optional Participation: If a user MAY enter records in a second table without entering records in the first table.

42 Degrees of Participation
We determine a table's degree of participation from:
–The minimum number of records it must associate with a single record in the related table.
–The maximum number of records that a related table may associate with a single record in the given table.
Think of the degree of participation as the minimum and maximum number of relationships for a single record in a table.

43 Example of Degree of Association
Assume that for a Department, advisors are assigned at least 1 student and up to 50 students, but no more.
The degree of participation of the Advisor Table would be 1,50. That is, an advisor must be assigned to at least one student in the Student Table, but has a limit of 50 students in the Student Table.
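
The 1,50 rule can be expressed as a simple range check. This sketch is only an illustration: `check_participation`, the advisor names, and the student counts are all hypothetical.

```python
def check_participation(counts, minimum=1, maximum=50):
    """Return the advisors whose student count falls outside minimum..maximum."""
    return {
        advisor: n
        for advisor, n in counts.items()
        if not minimum <= n <= maximum
    }


# Hypothetical number of students currently assigned to each advisor.
students_per_advisor = {"A. Smith": 12, "B. Jones": 0, "C. Lee": 51}

violations = check_participation(students_per_advisor)
print(violations)
```

Here one advisor has too few students and one has too many, so both break the 1,50 degree of participation.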

44 Field Specification
Field Specification (also called domain) includes all of the elements of a field. There are three types of field elements:
–General Elements: Include all of the basic information about a field, including the field name, the field description and a field's parent table.
–Physical Elements: Include information on how the field is constructed and how a user views the field; data type, field length and display format are all physical elements.

45 Field Specification (continued)
–Logical Elements: Describe the values that a field can store, including required values, range of values and default values.
Field specification is an important part of database design because it helps to enforce field-level integrity of a database.
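
Some logical elements (required values, ranges, defaults) map onto column constraints in SQL. The sketch below uses `sqlite3`; the `compensation` table layout and the specific limits (0 to 500, 0 to 100) are assumptions picked for the example, not rules from the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE compensation (
        employee_id INTEGER PRIMARY KEY,
        -- required value with an allowed range (limits are assumed)
        hourly_rate REAL NOT NULL CHECK (hourly_rate BETWEEN 0 AND 500),
        -- default value used when none is supplied
        commission_rate REAL NOT NULL DEFAULT 0.0
            CHECK (commission_rate BETWEEN 0 AND 100)
    )
""")

# No commission rate given, so the default applies.
conn.execute("INSERT INTO compensation (employee_id, hourly_rate) VALUES (100, 25.00)")
default_commission = conn.execute(
    "SELECT commission_rate FROM compensation WHERE employee_id = 100"
).fetchone()[0]

# A negative hourly rate is outside the allowed range and is rejected.
try:
    conn.execute("INSERT INTO compensation (employee_id, hourly_rate) VALUES (101, -5)")
    out_of_range_rejected = False
except sqlite3.IntegrityError:
    out_of_range_rejected = True

print(default_commission, out_of_range_rejected)
```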

46 Data Integrity
"Data integrity refers to the validity, consistency, and accuracy of the data in a database." (Hernandez, p. 71)
Four Types of Data Integrity:
–Table-level integrity
–Field-level integrity
–Relationship-level integrity
–Business rules

47 Table-Level Integrity
Also known as entity integrity
Ensures there are no duplicate records throughout a database
Makes sure that primary keys within a table are unique and never null

48 Field-Level Integrity
Also known as domain integrity
Guarantees that the structure of each field is sound:
–Values are "valid, consistent and accurate" (Hernandez, p. 71)
–Values of the same type (for instance, Academic Major) are defined in a consistent manner throughout the database

49 Relationship-Level Integrity
Also known as referential integrity
Checks to make sure that the relationships between tables are sound.
Also ensures that records in related tables are synchronized when someone enters data, deletes data or otherwise manipulates it.
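
One way a database keeps related tables synchronized on deletion is a cascading delete. The sketch below uses `sqlite3` with `ON DELETE CASCADE`; this is just one possible policy (rejecting the delete is another), and the table names and rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce the rule
conn.executescript("""
CREATE TABLE agents (agent_id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE clients (
    client_id INTEGER PRIMARY KEY,
    agent_id INTEGER NOT NULL
        REFERENCES agents(agent_id) ON DELETE CASCADE,
    last_name TEXT
);
INSERT INTO agents VALUES (100, 'Hernandez'), (102, 'Ehrlich');
INSERT INTO clients VALUES
    (9001, 100, 'Jameson'), (9002, 100, 'McLain'), (9003, 102, 'Pundt');
""")

# Deleting agent 100 also removes the client records that referred to it,
# so no client is left pointing at an agent that no longer exists.
conn.execute("DELETE FROM agents WHERE agent_id = 100")
remaining_clients = [
    row[0] for row in conn.execute("SELECT client_id FROM clients ORDER BY client_id")
]
print(remaining_clients)
```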

50 Business Rules
A database is framed to fit the ways in which an organization runs its business.
Business rules may affect several aspects of database design, including:
–Field ranges and valid values
–Types of table relationships
–Degree of relationships
–Degree of participation
–Synchronization of tables

51 Questions?

52 References
geekgirl's plain-english computing (website): http://www.geekgirls.com/menu_databases.htm
Database Design for Mere Mortals, 2nd Edition by Michael Hernandez (Addison-Wesley, 2004)

