Download presentation
Presentation is loading. Please wait.
Published byViolet Daniels Modified over 9 years ago
1
Chapter 10 Data and Knowledge Management
2
Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining
3
Data Set of discrete, objective facts about events Business - structured records of transactions Little relevance or purpose
4
Information Message with sender and receiver Meant to change way receiver perceives something Have an impact on his judgment / behavior
5
Data Processing Contextualize - why was data gathered? Categorize - what are its key components? Calculate - analyze mathematically Condense - summarize in more concise form
6
Information Processing Compare - in kind and in time Consequences - how used in decisions / actions Connections - relation to other information Conversation - what other people think about this information
7
Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining
8
Database Element Types Structure Models Creation Topology
9
Element Bit, byte, field, record, file, database Entity, attribute, key field Relation Class, object
10
Database Types Business database Geographical information database Knowledge database / deductive database Multimedia database Data warehouse Data marts Multimedia and hypermedia database Object-oriented database
11
Database Structure Data definition language –Schema & subschema Data Manipulation language –Structured Query Language (SQL) –Query By Example (QBE) Data dictionary
12
Database Models Hierarchical –One to many –TPS or routine MIS Network –Many to many –TPS or routine MIS Relational –Normalization –Ad hoc reports or DSS Object-oriented –E-commerce
13
Database Creation Conceptual design –Logical view –Entity-relationship (ER) diagram –Normalization
14
Entity Relationship Diagram Entity: object or concept Relationship: meaning association between objects Attribute: property of an object –Simple & Composite –Single-valued & multi-valued –Derived Key –Primary key –Foreign key
15
Normalization A technique for identifying a true primary key for a relation Types –First normal form: not repeating group –Second normal form: every non-primary-key attribute is fully functionally dependent on the entire primary key –Third normal form: no transit dependency
16
Structured Query Language Select Join
17
SQL DML - SELECT SELECT [DISTINCT|ALL] {* | [col- expr [AS newname]][,...] FROM table-name [alias] [,...] [WHERE condition] [GROUP by colm [, colm] [HAVING condition]] ORDER BY colm [, colm]
18
SQL DML - SELECT SELECT attributes (or calculations: +, -, /, *) FROM relation SELECT DISTINCT attributes FROM relation
19
Examples SELECT stuname FROM student; SELECT stuid, stuname, credit FROM student; SELECT stuid, stuname, credit+10 FROM student; SELECT DISTINCT major FROM student;
20
SQL DML - SELECT SELECT attributes (or * wild card) FROM relation WHERE condition
21
Examples SELECT * FROM student; SELECT stuname, major, credit FROM student WHERE stuid = ‘S114’; SELECT * FROM faculty WHERE dept = ‘MIS’;
22
SELECT - WHERE condition AND OR NOT IN NOT INBETWEEN IS NULLIS NOT NULL LIKE '%' multiple characters LIKE ‘_’ single characters
23
Examples SELECT * FROM faculty WHERE dept = ‘MIS’ AND rank = ‘full professor’; SELECT * FROM faculty WHERE dept = ‘MIS’ OR rank = ‘full professor’; SELECT * FROM faculty WHERE dept = ‘MIS’ NOT rank = ‘full professor’;
24
SELECT * FROM class WHERE room LIKE ‘B_S%’; SELECT * FROM class WHERE room NOT LIKE ‘BUS%’; SELECT productid, productname FROM inventory WHERE onhand BETWEEN 50 and 100;
25
SELECT companyid, companyname FROM company WHERE companyname BETWEEN ‘G’ AND ‘K’; SELECT productid, productname FROM inventory WHERE onhand NOT BETWEEN 50 and 100; SELECT companyid, companyname FROM company WHERE companyname NOT BETWEEN ‘G’ AND ‘K’;
26
SELECT facname FROM faculty WHERE dept IN (‘MIS’, ‘ACT’); SELECT facname FROM faculty WHERE rank NOT IN (‘assistant’, ‘lecture’); SELECT customername FROM customer WHERE emailadd IS NOT NULL;
27
SELECT customername FROM customer WHERE creditlimit IS NULL;
28
SELECT - aggregate functions COUNT (*) COUNT SUM AVG MIN MAX
29
Examples SELECT COUNT(*) FROM student; SELECT COUNT(major) FROM student; SELECT COUNT(DISTINCT major) FROM student;
30
SELECT COUNT(stuid), SUM(credit), AVG(credit), MAX(credit), MIN(credit) FROM student;
31
SELECT - GROUP GROUP BY HAVING
32
Examples SELECT major, AVG(credit) FROM student GROUP BY major HAVING COUNT(*) > 2; SELECT course#, COUNT(stuid) FROM enrollment GROUP BY course# HAVING COUNT(*) > 2;
33
SELECT major, AVG(credit) FROM student WHERE major IN (‘MIS’, ‘ACT’) GROUP BY major HAVING COUNT(*) > 2;
34
SELECT - ORDER BY ORDER BY ORDER BY... DESC
35
Examples SELECT facname, rank FROM faculty ORDER BY facname; SELECT facname, rank FROM faculty ORDER BY rank DESC, facname;
36
SELECT - JOIN Tables Multiple tables in FROM clause MUST have join conditions!!!
37
Examples SELECT stuname, grade FROM student, enrollment WHERE student.stuid = enrollment.stuid;
38
SELECT enrollment.course#, stuname, major FROM class, enrollment, student WHERE class.course# = enrollment.course# AND enrollment.stuid = student.stuid AND facid = ‘F114’ ORDER BY enrollment.course#;
39
SUBQUERY, EXIST, NOT EXIST SELECT s.stuname, major FROM student s WHERE EXIST (SELECT * FROM enrollment e WHERE s.stuid = e.stuid);
40
SELECT s.stuname, major FROM student s WHERE NOT EXIST (SELECT * FROM enrollment e WHERE s.stuid = e.stuid);
41
Database Creation Physical design –Physical view –Data topology (organization) Centralized Distributed database –Replicated database –Partitioned Organization & access method –Sequential file –Indexed sequential file –Direct or random file Security –Logical, physical, and transmitting
42
Selection Criteria User’ needs (type of application) Compatibility Portability Reliability Cost Feature Performance Vendor’s support Others?
43
Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining
44
Data Administrator Clean up data definitions Control shared data Manage distributed data Maintain data quality
45
Clean Up Definitions Synonyms / aliases Standard data definitions –Names and formats Data Dictionary –Active –Integrated
46
Control Shared Data Local - used by one unit Shared - used by two or more activities Impact of proposed program changes on shared data Program-to-data element matrix
47
Manage Distributed Data Geographically dispersed –Whether shared data or not Different levels of detail –Different management levels
48
Maintain Data Quality Put owners in charge of data –Verify data accuracy and quality Purge old data
49
Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining
50
The DBMS Data Base Management System: software that permits a firm to: –Centralize data –Manage them efficiently –Provide access to applications Such as payroll, inventory
51
DBMS Components Data Definition Language (DDL) Data Manipulation Language (DML) Inquiry Language (IQL) Teleprocessing Interface (TP)
52
Definitions Views: –Physical - how stored –Logical - how viewed and used by users Schema - Overall logical layout of records and fields in a database Subschema: Individual user’s logical portion of database (view)
53
Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining
54
Distributing Data Centralized files Fragemented files –Distribute data without duplication –Users unaware of where data located
55
Distributing Data Replicated files –Data duplicated –One site has master file –Problem with data synchronization Decentralized files –Local data autonomy
56
Distributing Data Distributed files –Client / server systems –Stored centrally –Portion downloaded to workstation –Workstation can change data –Changes uploaded to central computer
57
Agenda Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining
58
Data Warehousing Collect large amounts of data from multiple sources over several years Classify each record into multiple categories –Age –Location –Gender
59
Data Warehousing Rapidly select and retrieve by multiple dimensions –All females in Chicago under 25 years of age Provide tailored, on-demand reports Data mart: a replicated subset of the data warehouse –A functional or regional area
60
Data Mining Fitting models to, or determining patterns from, warehoused data Purposes: –Analyze large amount of data –Find critical points of knowledge –Perform automatic analyses
61
Data Mining Terms Data Visualization Drill-down Analysis –Hierarchical structure –Leads to increasing level of detail Expert System (ES) methodology –e.g., neural networks
62
Applications Finance - fraud detection Stock Market - forecasting Real estate - property evaluation Airlines - customer retention Retail - customer targeting
63
Data Mining Example What type customers are buying specific products? When are the times customers will most likely shop? What types of products can be sold together?
64
Points to Remember Information processing Database Data Administrator The DBMS Distributing data Data warehousing and data mining
65
Discussion Questions How can a database help an organization? Why normalization is very important for building a database? Do you see any problem of the database in your organization?
66
Discussion Questions What kind of database model is most suitable for –School? –Department store? –Police? Some organizations are hesitant to distribute data. These organizations feel that they may lose control. –Do they lose control? Why? –Could you suggest a “good” tactic? Could Data Mining pose a threat to individual privacy? –Why or why not? –If so, how can we mitigate that threat? –Do the advantages outweigh the disadvantages?
67
Assignment Review chapters 10 Read chapter 8, 9, and 11 Group assignment Research paper
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.