
Slide 1: Introduction to Data Mining and Business Intelligence

Slide 2: Why Mine Data? Commercial Viewpoint
- Lots of data is being collected and warehoused
  - Web data, e-commerce
  - Purchases at department/grocery stores
  - Bank/credit card transactions
- Computers have become cheaper and more powerful
- Competitive pressure is strong
  - Provide better, customized services for an edge (e.g. in Customer Relationship Management)

Slide 3: Why Mine Data? Scientific Viewpoint
- Data is collected and stored at enormous speeds (GB/hour)
  - Remote sensors on a satellite
  - Telescopes scanning the skies
  - Microarrays generating gene expression data
  - Scientific simulations generating terabytes of data
- Traditional techniques are infeasible for raw data
- Data mining may help scientists
  - in classifying and segmenting data
  - in hypothesis formation

Slide 4: Mining Large Data Sets - Motivation
- There is often information “hidden” in the data that is not readily evident
- Human analysts may take weeks to discover useful information
- Much of the data is never analyzed at all
[Figure: The Data Gap. Total new disk (TB) since 1995 compared with the number of analysts.]
From: R. Grossman, C. Kamath, V. Kumar, “Data Mining for Scientific and Engineering Applications”

Slide 5: What is business intelligence?

Slide 6: Business Intelligence
- Business intelligence (BI): applications and technologies used to gather, provide access to, and analyze data and information to support decision-making efforts

Slide 7: The Problem: Data Rich, Information Poor
- Businesses face a data explosion as digital images, email in-boxes, and broadband connections double by 2010
- The amount of data generated is doubling every year
- Some believe it will soon double monthly

Slide 8: The Solution: Business Intelligence
- Improving the quality of business decisions has a direct impact on costs and revenue
- BI systems and tools help create an agile, intelligent enterprise

Slide 9: The Solution: Business Intelligence
- BI enables business users to receive data for analysis that is:
  - Reliable
  - Consistent
  - Understandable
  - Easily manipulated

Slide 10: The Solution: Business Intelligence
- BI can answer tough customer questions

Slide 11: What is data mining?

Slide 12: Data Mining
- Data mining (knowledge discovery from data)
  - Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data

Slide 13: What is Data Mining?
- Many definitions
  - Non-trivial extraction of implicit, previously unknown and potentially useful information from data
  - Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns

Slide 14: What is (not) Data Mining?
- What is Data Mining?
  - Certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in Boston area)
  - Group together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
- What is not Data Mining?
  - Look up a phone number in a phone directory
  - Query a Web search engine for information about “Amazon”

Slide 15: Data Mining
- Data-mining tools: use a variety of techniques to find patterns and relationships in large volumes of information
  - Clustering
  - Classification
  - Affinity grouping (Association Detection)
  - Statistical Estimation and Prediction

Slide 16: Cluster Analysis
- Cluster analysis: a technique used to divide an information set into mutually exclusive groups such that the members of each group are as close to one another as possible and the different groups are as far apart as possible
- CRM systems depend on cluster analysis to segment customer information and identify behavioral traits

Slide 17: Cluster Analysis
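
The grouping idea on the cluster analysis slides can be sketched with k-means, one common clustering technique (the slides do not name a specific algorithm, so this choice is an assumption). The customer data, the feature meanings (visits per month, average spend), and the function name `kmeans` are all hypothetical, and the initialization is deliberately naive.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Minimal k-means: assign points to the nearest centroid, then re-average."""
    # Naive initialization (an assumption for this sketch): use the first k points.
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers: (visits per month, average spend)
customers = [(1, 20), (2, 25), (1, 22), (10, 200), (11, 210), (9, 190)]
centroids, clusters = kmeans(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

On this toy data the occasional low spenders and the frequent high spenders end up in separate groups, which is the kind of customer segmentation the CRM bullet describes. Real data would normally need feature scaling and a better initialization.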

Slide 18: Classification
- Classification: finds a model to categorize input information into several pre-defined groups
- E.g. classification of credit card approval applications, classification of documents, etc.
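
As a rough illustration of categorizing input into pre-defined groups, the sketch below uses a nearest-centroid classifier, chosen only because it is short (the slide does not specify a model). The training examples, the features (income in thousands, years of credit history), and the labels "approve" and "reject" are invented for the example.

```python
from collections import defaultdict
from math import dist


def fit_centroids(examples):
    """Learn one centroid (mean feature vector) per pre-defined label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(model, features):
    """Assign the label whose centroid is closest to the input."""
    return min(model, key=lambda label: dist(features, model[label]))


# Hypothetical labeled applications: (income in thousands, years of credit history)
training = [
    ((60, 5), "approve"), ((75, 8), "approve"), ((80, 10), "approve"),
    ((20, 1), "reject"), ((25, 0), "reject"), ((30, 2), "reject"),
]
model = fit_centroids(training)
print(classify(model, (70, 6)))
print(classify(model, (22, 1)))
```

Unlike clustering, the groups here are fixed in advance by the labeled examples; the model only learns how to place new inputs into them.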

Slide 19: Association Detection
- Association detection: reveals the degree to which variables are related and the nature and frequency of these relationships in the information
  - Market basket analysis
  - E.g. beer and diapers were often purchased together, so move them closer together
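
The "degree to which variables are related" in market basket analysis is commonly measured with support, confidence, and lift. The sketch below computes these three standard measures on a handful of made-up baskets; the basket contents and function names are assumptions for illustration.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets with the antecedent, the fraction that also hold the consequent.

    Assumes the antecedent appears in at least one basket.
    """
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)


def lift(baskets, antecedent, consequent):
    """Confidence relative to how common the consequent is overall (>1 suggests association)."""
    return confidence(baskets, antecedent, consequent) / support(baskets, consequent)


# Hypothetical shopping baskets
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"beer", "diapers", "milk"},
    {"bread", "diapers"},
]
print(support(baskets, {"beer", "diapers"}))
print(confidence(baskets, {"beer"}, {"diapers"}))
print(lift(baskets, {"beer"}, {"diapers"}))
```

A lift above 1 for "beer implies diapers" indicates the two items appear together more often than their individual frequencies would predict.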

Slide 20: Statistical Analysis
- Statistical analysis: performs such functions as information correlations, distributions, calculations, and variance analysis
  - Forecast: predictions made on the basis of time-series information
  - Time-series information: time-stamped information collected at a particular frequency
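
Two of the functions named on the statistical analysis slide can be sketched briefly: a forecast from time-series information and a correlation between two variables. The moving-average forecast is just one simple forecasting method picked for illustration, and the monthly figures and function names are hypothetical.

```python
from math import sqrt


def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    recent = series[-window:]
    return sum(recent) / window


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical time series: one observation per month
monthly_sales = [100, 110, 120, 130, 140, 150]
ad_spend = [10, 11, 12, 13, 14, 15]

print(moving_average_forecast(monthly_sales, window=3))
print(pearson_correlation(ad_spend, monthly_sales))
```

A moving average lags behind a steadily rising series, so it is best read as a baseline rather than a serious forecasting model.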

