Download presentation
Presentation is loading. Please wait.
1
Database Access and Techniques, most notably Data Mining.
2
Tools: Computers and IT. VB, VBA, Excel, InterDev, Etc. Humans: Decision Making Process Algorithms: Math/Flow Chart stuff that helps the tools help the humans make decisions. DSS Data: Facts pertinent to the decision at hand.
3
Warehouses & Marts Data Warehouse –a database designed to support decision-making in an organization. It is batch-updated and structured for fast online queries and exploration. Data warehouses may aggregate enormous amounts of data from many different operational systems. Data Mart –a database focused on addressing the concerns of a specific problem or business unit (e.g. Marketing, Engineering). Size doesn’t define data marts, but they tend to be smaller than data warehouses.
4
Data Warehouses & Data Marts TPS & other operational systems Data Warehouse Data Mart (Marketing) Data Mart (Engineering) 3rd party data = query, OLAP, mining, etc. = operational clients
5
Data Warehousing Physical separation of operational and decision support environments Purpose: to establish a data repository making operational data accessible Transforms operational data to relational form Only data needed for decision support come from the TPS Data are transformed and integrated into a consistent structure Data warehousing (or information warehousing): a solution to the data access problem End users perform ad hoc query, reporting analysis and visualization
6
Data Warehousing Benefits Increase in knowledge worker productivity Supports all decision makers’ data requirements Provide ready access to critical data Insulates operation databases from ad hoc processing Provides high-level summary information Provides drill down capabilities Yields –Improved business knowledge –Competitive advantage –Enhances customer service and satisfaction –Facilitates decision making –Help streamline business processes
7
DW Suitability For organizations where Data are in different systems Information-based approach to management in use Large, diverse customer base Same data have different representations in different systems Highly technical, messy data formats
8
Transform Data from TPS to Warehouse Consolidate data –e.g. from multiple TPS around the country/world “Scrub” the data –keep definitions consistent (e.g. translate part numbers/product names if they differ per country) Calculate fields (decrease processor load) Summarize fields (decrease processor load) De-normalize data (ease of use)
9
OLAP: Data Access and Mining, Querying and Analysis Online Analytical processing (OLAP) –DSS and EIS computing done by end-users in online systems –Versus online transaction processing (OLTP) OLAP Activities –Generating queries –Requesting ad hoc reports –Conducting statistical analyses –Building multimedia applications
10
OLAP uses the data warehouse and a set of tools, usually with multidimensional capabilities Query tools Spreadsheets Data mining tools Data visualization tools
11
Using SQL for Querying SQL (Structured Query Language) Data language English-like, nonprocedural, very user friendly language Free format Example: SELECTName, Salary FROMEmployees WHERESalary >2000
13
Query Tools & OLAP Query Tools –user-lead discovery. Can return individual records or summaries. Requests are formulated in advance (e.g. “show me all delinquent accounts in the northeast region during Q1”). OLAP - Online Analytical Processing –user-lead discovery. Data is explored via “drill down” into the data by selecting variables to summarize on. Results are usually reported in a cross-tab report or graph (e.g. “show me a tabular breakdown of sales by business unit, product type, and year”).
14
OLAP Online Analytical Processing. (example of cross-tab results presented below) 1. business unit 2. product type 3. year
16
Data Mining automated information discovery process, uncovers important patterns in existing data –can use neural networks or other approaches. Requires ‘clean’, reliable, consistent data. Historical data must reflect the current environment. e.g. “What are the characteristics that identify when we are likely to lose a customer?”
17
Data Mining For Knowledge discovery in databases Knowledge extraction Data archeology Data exploration Data pattern processing Data dredging Information harvesting
18
Data Mining Examples Market Segmentation - e.g. Dayton Hudson Direct Marketing - e.g. Chase Market basket analysis - e.g. Wal-Mart Customer Churn - e.g. Fleet Bank Fraud Detection - e.g. Bank of America Cost Reduction Prospecting - e.g. Merk Medco.
19
Major Data Mining Characteristics and Objectives Data are often buried deep Client/server architecture Sophisticated new tools--including advanced visualization tools--help to remove the information “ore” Massaging and synchronizing data Usefulness of “soft” data End-user minor is empowered by “data drills” and other power query tools with little or no programming skills Often involves finding unexpected results Tools are easily combined with spreadsheets etc. Parallel processing for data mining
20
Data Mining Application Areas Marketing Banking Retailing and sales Manufacturing and production Brokerage and securities trading Insurance Computer hardware and software Government and defense Airlines Health care Broadcasting Law Enforcement … almost everywhere!!!
21
Stupid Data-Miner Tricks Ad-Hoc Theories –when an oddity jumps out of the data, it’s tempting to develop a theory for it. Sometimes findings are just statistical flukes. Using Too Many Variables –the more factors considered, the more likely a relationship will be found - valid or not. Not Taking No for an Answer –it’s OK to stop looking if you can’t find anything. There are no silver bullets. Limited or incorrect interpretation
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.