Published by Emery Ross. Modified over 9 years ago.
1
Databases and Statistical Databases
Session 4
Mark Viney
Australian Bureau of Statistics
5 June 2007
2
Terms
Database
  - A shared collection of logically related data (and a description of this data), designed to meet the information needs of an organisation
Database Management System (DBMS)
  - A software system that enables users to define, create and maintain the database, and that provides controlled access to it
3
Terms (example)
Database
  - Personnel database
  - Stock database
  - Statistical database
Database Management System (DBMS)
  - Oracle
  - DB2
  - Access
  - MySQL
  - FoxPro
  - Firebird
4
Why keep information in databases?
Accessibility of data
  - Increased concurrency (reads and writes)
  - Sharing data
Improved data integrity
Improved security
  - Access only to necessary data
Relatable
  - More information from the same amount of data
Visible
5
Why keep information in databases? (continued)
Backup and recovery
Improved productivity
  - Common tools / common processes
6
Disadvantages of databases
Complexity
Size
Cost of DBMS
Need to upgrade versions
Additional hardware costs
Higher impact of failure
www.cableready.net/newsletter/winter99.html
7
Databases
Used to be solely on mainframes
Commonly on minicomputers
Increasingly available on microcomputers
Mostly accessed through SQL
8
Relational Databases
Entities
  - datatypes
  - validation
Relationships
  - rules for interaction
9
Database Tables
Rows and columns
Fixed number of columns
Multiple rows (records)
All values in a column are of the same datatype
10
Structured Query Language - SQL
Standard database language that allows:
  - Creation of the database and relation structures
  - Basic data management tasks
  - Both simple and complex queries
11
SQL - Data Definition - DDL
Allows creation, modification and deletion of database objects
  - Creation - CREATE
      CREATE TABLE TAB1 (COL1 NUMBER, COL2 NUMBER);
  - Modification - ALTER
      ALTER TABLE TAB1 ADD COL3 NUMBER;
  - Deletion - DROP
      DROP TABLE TAB1;
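The three DDL statements above can be tried out with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module and an in-memory database, neither of which is part of the original slides; SQLite happens to accept the Oracle-style NUMBER type name, but other DBMSs may need a different datatype. The helper column_names is hypothetical and exists only to show the effect of each statement.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the DBMS on the slide.
conn = sqlite3.connect(":memory:")

def column_names(table):
    # Hypothetical helper: PRAGMA table_info returns one row per column,
    # with the column name at index 1.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# Creation - CREATE
conn.execute("CREATE TABLE TAB1 (COL1 NUMBER, COL2 NUMBER)")
cols_after_create = column_names("TAB1")

# Modification - ALTER
conn.execute("ALTER TABLE TAB1 ADD COL3 NUMBER")
cols_after_alter = column_names("TAB1")

# Deletion - DROP
conn.execute("DROP TABLE TAB1")
tables_after_drop = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(cols_after_create)   # columns straight after CREATE
print(cols_after_alter)    # columns after ALTER added COL3
print(tables_after_drop)   # no user tables remain after DROP
```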
12
Structured Query Language - SQL
Data Manipulation - DML
Standard language to allow access to the data stored in databases
  - Extraction - SELECT
      SELECT COL1, COL2 FROM TAB1;
  - Loading - INSERT
      INSERT INTO TAB1 (COL1, COL2) VALUES (7, 22);
  - Manipulation - UPDATE
      UPDATE TAB1 SET COL2 = COL1 + 2;
  - Deletion - DELETE
      DELETE FROM TAB1 WHERE COL2 = 4;
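As a rough illustration of the four DML statements, the sketch below runs them in order against an in-memory SQLite database. The use of Python's sqlite3 module and the two extra rows are assumptions added for illustration; only the SQL statements themselves come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TAB1 (COL1 NUMBER, COL2 NUMBER)")

# Loading - INSERT (the slide's row, plus two illustrative rows that are assumptions)
conn.execute("INSERT INTO TAB1 (COL1, COL2) VALUES (7, 22)")
conn.executemany("INSERT INTO TAB1 (COL1, COL2) VALUES (?, ?)", [(2, 0), (10, 0)])

# Manipulation - UPDATE: there is no WHERE clause, so every row is changed
conn.execute("UPDATE TAB1 SET COL2 = COL1 + 2")

# Extraction - SELECT
rows_after_update = conn.execute(
    "SELECT COL1, COL2 FROM TAB1 ORDER BY COL1"
).fetchall()

# Deletion - DELETE: only the row whose COL2 is now 4 (COL1 = 2) is removed
conn.execute("DELETE FROM TAB1 WHERE COL2 = 4")
rows_after_delete = conn.execute(
    "SELECT COL1, COL2 FROM TAB1 ORDER BY COL1"
).fetchall()

print(rows_after_update)
print(rows_after_delete)
```

Note that the UPDATE on the slide has no WHERE clause, so it rewrites COL2 in every row of the table.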
13
Database Modeling
Representation of the "real world"
Conceptual model
Logical model
Physical model
14
Keys
Primary Keys
  - uniquely identify a record
Foreign Keys
  - pointer to a Primary Key in another table
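A small sketch of both kinds of key, assuming Python's sqlite3 module and two hypothetical tables, DEPT and EMP, that do not appear in the slides. In SQLite, foreign keys are only enforced after PRAGMA foreign_keys = ON; other DBMSs typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: switch enforcement on

# Hypothetical tables: DEPT_ID is the primary key of DEPT and a foreign key in EMP.
conn.execute("CREATE TABLE DEPT (DEPT_ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.execute("""
    CREATE TABLE EMP (
        EMP_ID  INTEGER PRIMARY KEY,
        NAME    TEXT,
        DEPT_ID INTEGER REFERENCES DEPT (DEPT_ID)
    )
""")
conn.execute("INSERT INTO DEPT VALUES (1, 'Statistics')")
conn.execute("INSERT INTO EMP VALUES (100, 'Ann', 1)")

def try_insert(sql):
    # Hypothetical helper: report whether the DBMS accepted the statement.
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

# Reusing primary key value 1 would make the record no longer uniquely identified.
duplicate_key = try_insert("INSERT INTO DEPT VALUES (1, 'Duplicate')")
# DEPT_ID 99 points at no primary key in DEPT.
orphan_row = try_insert("INSERT INTO EMP VALUES (101, 'Bob', 99)")

print(duplicate_key, orphan_row)
```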
15
Indexes
May be applied to columns to allow fast data access
May be applied to a single column or to several columns
Direct pointers to rows containing specific values in the indexed column(s)
May be unique or non-unique
A table may have more than one index
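The points above can be illustrated with a short sketch. It assumes Python's sqlite3 module and a hypothetical PERSON table; the index names are invented for the example. EXPLAIN QUERY PLAN is SQLite-specific, and its wording varies between versions, so the sketch only looks for the index name in the plan text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERSON (SURNAME TEXT, GIVEN_NAME TEXT, REGION TEXT)")

# A non-unique index on a single column ...
conn.execute("CREATE INDEX IDX_PERSON_REGION ON PERSON (REGION)")
# ... and a unique index on several columns, both on the same table.
conn.execute("CREATE UNIQUE INDEX IDX_PERSON_NAME ON PERSON (SURNAME, GIVEN_NAME)")

conn.execute("INSERT INTO PERSON VALUES ('Smith', 'Jo', 'ACT')")

# Ask SQLite how it would run a lookup on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM PERSON WHERE REGION = 'ACT'"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)  # last field is the description

# The unique index rejects a second row with the same (SURNAME, GIVEN_NAME).
try:
    conn.execute("INSERT INTO PERSON VALUES ('Smith', 'Jo', 'NSW')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

# PRAGMA index_list returns one row per index, with the index name at position 1.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list(PERSON)"))

print(plan_text)
print(duplicate_allowed, index_names)
```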
16
Normalisation
A technique for producing a set of relations with desirable properties, given the data requirements of an enterprise
17
Normalisation - unnormalised
A representation of the data that contains repeating groups
18
Normalisation - unnormalised form
19
Normalisation - 1st normal form (1NF)
A relation in which the intersection of each row and column contains one and only one value
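To make the definition concrete, here is a minimal sketch in plain Python. The customer records and phone numbers are made-up illustration data, not taken from the slides. It starts from an unnormalised record containing a repeating group and flattens it so that every row/column intersection holds exactly one value.

```python
# Unnormalised form: each customer carries a repeating group of phone numbers.
# (Hypothetical data, for illustration only.)
unnormalised = [
    {"customer_id": 1, "name": "Ann", "phones": ["555-0100", "555-0101"]},
    {"customer_id": 2, "name": "Bob", "phones": ["555-0200"]},
]

# First normal form: one row per (customer, phone), so no cell holds a list.
first_nf = [
    (rec["customer_id"], rec["name"], phone)
    for rec in unnormalised
    for phone in rec["phones"]
]

for row in first_nf:
    print(row)
```

The price of 1NF here is redundancy (the name "Ann" now appears twice), which is what the later normal forms address.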
20
Normalisation - 1st normal form (1NF)
21
Normalisation - 2nd normal form (2NF)
A relation
  - that is in first normal form, and
  - in which every non-primary-key attribute is fully functionally dependent on the primary key
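A sketch of what removing a partial dependency might look like, in plain Python with invented order data. The assumed primary key is the pair (order_id, product_id); product_name depends on product_id alone, so it is not fully functionally dependent on the whole key and moves to its own relation.

```python
# 1NF relation with composite primary key (order_id, product_id).
# Columns: order_id, product_id, product_name, quantity (hypothetical data).
order_lines_1nf = [
    (1, "P1", "Widget", 5),
    (1, "P2", "Gadget", 1),
    (2, "P1", "Widget", 3),
]

# product_name is determined by product_id alone (a partial dependency),
# so it is split out into a PRODUCT relation keyed by product_id.
products = {product_id: name for _, product_id, name, _ in order_lines_1nf}

# What remains depends on the full key (order_id, product_id).
order_lines_2nf = [
    (order_id, product_id, quantity)
    for order_id, product_id, _, quantity in order_lines_1nf
]

print(products)
print(order_lines_2nf)
```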
22
Normalisation - 2nd normal form (2NF)
23
Normalisation - 3rd normal form (3NF)
A relation
  - that is in first and second normal form, and
  - in which no non-primary-key attribute is transitively dependent on the primary key
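A sketch of removing a transitive dependency, again in plain Python with invented employee data. The assumed primary key is emp_id; dept_name depends on dept_id, which in turn depends on emp_id, so dept_name is only transitively dependent on the key.

```python
# 2NF relation with primary key emp_id.
# Columns: emp_id, name, dept_id, dept_name (hypothetical data).
employees_2nf = [
    (100, "Ann", "D1", "Statistics"),
    (101, "Bob", "D1", "Statistics"),
    (102, "Cy", "D2", "Finance"),
]

# emp_id -> dept_id -> dept_name is a transitive dependency, so the
# department name moves to its own relation keyed by dept_id.
departments = {dept_id: dept_name for _, _, dept_id, dept_name in employees_2nf}
employees_3nf = [
    (emp_id, name, dept_id) for emp_id, name, dept_id, _ in employees_2nf
]

# Joining the two relations on dept_id recovers the original rows,
# so no information was lost by the decomposition.
rejoined = [
    (emp_id, name, dept_id, departments[dept_id])
    for emp_id, name, dept_id in employees_3nf
]

print(departments)
print(rejoined == employees_2nf)
```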
24
Normalisation - 3rd normal form (3NF)
25
Loading data into databases
Bulk loading tool
Data integrity
Validation
Ad-hoc loading
26
Data Extraction
Assemble data into usable format
Spreadsheet
Timeseries
Data Cube
Publication
27
Data manipulation
Inside database
  - Sophisticated manipulation language - SQL
Outside database
  - Timeseries
      Seasonal Adjustment
      Chain Volume Measures (Constant Price)
  - SAS, SPSS
28
Transactional Integrity
The ability to apply rules to the data via database constraints
The ability to group several discrete data insertions or data manipulations into one logical data change
In SQL, controlled via COMMIT and ROLLBACK statements
29
Transactional Integrity
Database constraints
  - values must
      conform to specific rules
      exist in a specific column
      belong to a "set"
      be unique
If a validation against a constraint fails
  - the current transaction fails
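One possible way to see constraints and transaction failure working together is sketched below. It assumes Python's sqlite3 module and a hypothetical RESPONSE table; the mapping of the slide's bullet points to CHECK and UNIQUE constraints is an interpretation, not something the slide spells out. In SQLite a failed constraint raises an error on the offending statement, and it is the caller (here the `with conn:` block) that rolls the whole transaction back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a rule (CHECK on AGE), a "set" (CHECK ... IN) and uniqueness.
conn.execute("""
    CREATE TABLE RESPONSE (
        RESPONDENT_ID INTEGER UNIQUE,
        AGE           INTEGER CHECK (AGE BETWEEN 0 AND 120),
        STATE         TEXT    CHECK (STATE IN ('NSW', 'VIC', 'QLD'))
    )
""")

def load(rows):
    """Insert all rows as one transaction; return True only if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO RESPONSE VALUES (?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False

ok = load([(1, 34, "NSW"), (2, 58, "VIC")])
bad_rule = load([(3, 41, "QLD"), (4, 999, "QLD")])  # AGE 999 breaks the rule
bad_set = load([(5, 20, "XX")])                     # 'XX' is not in the set
bad_unique = load([(1, 50, "NSW")])                 # RESPONDENT_ID 1 already exists

row_count = conn.execute("SELECT COUNT(*) FROM RESPONSE").fetchone()[0]
print(ok, bad_rule, bad_set, bad_unique, row_count)
```

In the second call the valid row (3, 41, 'QLD') is discarded along with the invalid one, because both belong to the same transaction.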
30
Transactions & Recovery
Each transaction is logged by the DBMS
Backups taken periodically
Data can be recovered
  - to an archived backup
  - to a point in time
31
Transaction example
COMMIT;
transaction 1
    INSERT INTO TABLE1 (COL1,COL2) VALUES(7,22);
    UPDATE TABLE1 SET COL1 = 77 WHERE COL2 = 22;
    DELETE FROM TABLE1 WHERE COL1 = 7;
ROLLBACK;
transaction 2
    INSERT INTO TABLE2 (COL3,COL4) VALUES('ABC',11);
    UPDATE TABLE2 SET COL3 = 'XYZ';
    DELETE FROM TABLE2 WHERE COL3 = 'xyz';
COMMIT;
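The transaction example can be replayed with a sketch like the following, assuming Python's sqlite3 module with an in-memory database and column types chosen for illustration (the slide does not define the tables). The statements are the slide's own; conn.rollback() and conn.commit() play the role of ROLLBACK and COMMIT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table definitions; the slide only shows the statements that use them.
conn.execute("CREATE TABLE TABLE1 (COL1 NUMBER, COL2 NUMBER)")
conn.execute("CREATE TABLE TABLE2 (COL3 TEXT, COL4 NUMBER)")
conn.commit()

# transaction 1: three changes grouped into one logical change, then discarded
conn.execute("INSERT INTO TABLE1 (COL1,COL2) VALUES(7,22)")
conn.execute("UPDATE TABLE1 SET COL1 = 77 WHERE COL2 = 22")
conn.execute("DELETE FROM TABLE1 WHERE COL1 = 7")
conn.rollback()

# transaction 2: three changes grouped into one logical change, then made permanent
conn.execute("INSERT INTO TABLE2 (COL3,COL4) VALUES('ABC',11)")
conn.execute("UPDATE TABLE2 SET COL3 = 'XYZ'")
# String comparison is case-sensitive here, so 'xyz' matches no row.
conn.execute("DELETE FROM TABLE2 WHERE COL3 = 'xyz'")
conn.commit()

table1_rows = conn.execute("SELECT * FROM TABLE1").fetchall()
table2_rows = conn.execute("SELECT * FROM TABLE2").fetchall()
print(table1_rows)  # nothing from transaction 1 survives the rollback
print(table2_rows)  # the committed row from transaction 2
```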
32
References
Thomas Connolly, Carolyn Begg, Anne Strachan, Database Systems: A Practical Approach to Design, Implementation and Management (Addison-Wesley), 1999
Cartoons - Randy Glasbergen
33
Questions?