Knowledge Discovery Centre: CityU-SAS Partnership 1 Speakers: Prof Y V Hui, CityU Dr H P Lo, CityU Dr Sammy Yuen, CityU Dr K W Cheng, SAS Institute Mr.

2 Knowledge Discovery Centre: CityU-SAS Partnership 1 Speakers: Prof Y V Hui, CityU Dr H P Lo, CityU Dr Sammy Yuen, CityU Dr K W Cheng, SAS Institute Mr Steven Parker, Standard Chartered

3 The Art and Science of Data Mining
Y V Hui, City University of Hong Kong

4 The Driving Forces
Specialization and focus in business
- To satisfy the needs of customers
- To improve and develop specific business strategies and processes
- Personalization through mass customization

5 The Driving Forces
Challenges
- local and global competition
- distributed business operations
- product innovation
Technology development
Benefit, cost and risk on a product or customer basis

6 Data Mining
- Also known as knowledge discovery in databases.
- Data mining digs out valuable information from large and messy data. (Computer scientist’s definition)
- Data mining is a knowledge discovery process. It is the integration of business knowledge, people, information, statistics and computing technology.

7 Data Mining is Hot
- Ten Hottest Jobs, Time, 22 May 2000
- 10 emerging areas of technology, MIT’s Magazine of Technology Review, Jan/Feb 2001

8 Data Mining Philosophy
- A powerful enabler of competitive advantage.
- Data mining is driven by business knowledge.
- Data mining is about enabling people to discover actionable information about their business.
- Return of profit isn’t about algorithms.

9 Scope of Data Mining
Management’s Decision World: business outlook, industry conditions, product offering, customer analysis, strategic options, competitive actions, etc.
Interface: problem development and management, reporting and evaluations
Data Miner’s Analytical World: project design, data collection and preparation, model building, validation

10 Project Management
- Cross-functional team
- System architecture

11 Successful applications
Business transaction
- risks and opportunities
Customer relationship management
- personalization, target marketing
Electronic commerce & web
- web mining

12 Successful applications
- Science & engineering
- Health care
- Multi-media
- Others

13 Data Mining Process
- Understanding of business
- Problem identification

14 Understanding Your Business
Do we have a problem?
- What is the current situation? Are there any undesirable situations that need attention?
- Are there any conditions, processes, etc., that could be improved?
- Are any problems foreseeable that could affect the business?
- Are there any potential opportunities that the company may capitalize on?
A problem is a learning opportunity.

15 Understanding Your Problem
- Operational or analytical
- Convention rule or knowledge discovery
- Product based or customer based
- Market research or data mining
- Ownership of the information
- Privacy
- Added value

16 Data Mining Process
- Collecting relevant information
- Understanding of business
- Problem identification

17 Collecting Relevant Information
- Data Search
- Data Collection
- Data Preparation
- Data Mining Database

18 Data Search
- Exploring the problem space. Don’t let the data drive the problem.
- Measurement
- Exploring the data sources

19 Data Collection
- Data retrieval
- Data audit
- Data set assembly and data warehouse
- Survey

20 Data Preparation
- Data representation
- Data exploration
- Data normalization
- Data transformation
- Imputation of missing data
- Data tuning
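Two of the preparation steps listed above, imputation of missing data and data normalization, can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not say which methods were used, so mean imputation and min-max scaling are stand-ins, and the `incomes` values and function names are hypothetical.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values (assumed method)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1] (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical column with one missing entry.
incomes = [20000, None, 50000, 80000]
imputed = impute_mean(incomes)
normalized = min_max_normalize(imputed)
```

Imputing before normalizing keeps the filled-in value inside the observed range, so it does not distort the scaling.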

21 Data Mining Database
- Variable selection
- Record selection
- Data set partition
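Data set partition usually means holding back part of the records so that a model can be checked on data it has not seen. The sketch below assumes a random 60/20/20 split into training, validation and test sets. The proportions, the seed and the `partition` helper are illustrative choices, not taken from the slides.

```python
import random

def partition(records, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle records and split them into train, validation and test sets."""
    shuffled = list(records)
    # A seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

# Hypothetical record identifiers 0..99.
train, valid, test = partition(range(100))
```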

22 Data Mining Process
- Collecting relevant information
- Model building
- Understanding of business
- Problem identification
- Learning

23 Model Building
Model based vs non-model based
$(y_1, y_2, \ldots, y_p) = f(x_1, \ldots, x_q)$
Inputs: $x_1, \ldots, x_q$
Outputs: $y_1, \ldots, y_p$

24 Model Building
Parametric vs nonparametric

25 Model Building
- Estimation vs trial and error
- Directed vs undirected
- Multidimensional analysis
- Large data set vs small data set

26 Data Mining Algorithms
Online Analytical Processing: SQL, Query Tools
Discovery Driven Methods
- Description: Visualization, Clustering, Association, Sequential Analysis
- Prediction: Classification, Regressions, Decision Trees, Neural Networks

27 Online Analytical Processing
- Query and reporting
- Example of SQL query: How many credit-card customers who made purchases of over $1,000 on sporting goods in December have at least $20,000 of available credit?
- Manual and validation driven
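The example question above can be written as an actual query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` and `purchases` tables, their columns and the sample rows are all hypothetical, and "purchases of over $1,000" is read here as the December total per customer, which is one possible interpretation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, available_credit REAL);
CREATE TABLE purchases (
    customer_id INTEGER, category TEXT, amount REAL, purchase_date TEXT
);
INSERT INTO customers VALUES (1, 25000), (2, 15000), (3, 30000);
INSERT INTO purchases VALUES
    (1, 'sporting goods', 1200, '2000-12-05'),
    (2, 'sporting goods', 1500, '2000-12-10'),
    (3, 'sporting goods',  400, '2000-12-12'),
    (3, 'groceries',      2000, '2000-12-15');
""")

# Count customers with enough credit whose December sporting-goods
# purchases sum to more than 1,000.
query = """
SELECT COUNT(*) FROM customers AS c
WHERE c.available_credit >= 20000
  AND (SELECT COALESCE(SUM(p.amount), 0) FROM purchases AS p
       WHERE p.customer_id = c.id
         AND p.category = 'sporting goods'
         AND strftime('%m', p.purchase_date) = '12') > 1000
"""
count = conn.execute(query).fetchone()[0]
```

The analyst has to know the question in advance, which is why the slide describes this style of analysis as manual and validation driven.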

28 Estimation and Prediction
- Statistical models
- Neural network
- Example: Housing price valuation model
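As a minimal sketch of a statistical model for the housing example, the code below fits a straight line by ordinary least squares with floor area as the only predictor. The slides do not describe the actual valuation model, so the single-variable form, the `fit_line` helper and the data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: floor area and price in arbitrary units.
areas = [50, 70, 90, 110]
prices = [3.0, 4.0, 5.0, 6.0]

slope, intercept = fit_line(areas, prices)
# Estimated price for a flat of area 80.
predicted = slope * 80 + intercept
```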

29 Classification Algorithms
- Statistical techniques
- Neural networks
- Genetic algorithms
- Nearest neighbor method
- Rule induction and decision tree
- Example: Customer segmentation and buying behavior description
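Of the techniques listed, the nearest neighbor method is the simplest to show in code. The sketch below classifies a new customer by majority vote among the k closest labeled customers. The two features (age and monthly spend), the segment labels and the `knn_classify` helper are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled customers: ((age, monthly spend), segment).
customers = [
    ((25, 20), "budget"),
    ((30, 25), "budget"),
    ((28, 22), "budget"),
    ((45, 90), "premium"),
    ((50, 95), "premium"),
    ((48, 85), "premium"),
]

label_a = knn_classify(customers, (27, 24))
label_b = knn_classify(customers, (47, 88))
```

In practice the features would normally be put on a common scale first, since Euclidean distance is dominated by whichever variable has the largest range.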

30 Association Rules
- Apriori algorithm
- Example: Market basket analysis, cross selling analysis
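A compact sketch of the Apriori idea on a toy market basket: itemsets are built level by level, and a candidate is kept only if all of its subsets are already frequent. The baskets, the 0.5 support threshold and the helper names are illustrative assumptions, and the code is not tuned for large data.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= b for b in baskets) / n

    items = {item for b in baskets for item in b}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        level = {}
        for c in current:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=0.5)
butter_to_bread = confidence(frequent, {"butter"}, {"bread"})
```

A rule such as butter -> bread with high confidence is the kind of pattern a cross-selling analysis looks for.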

31 Sequential Analysis
- Count-all algorithm
- Count-some algorithm
- Example: Attached mailing, add-on sales

32 Algorithms Comparison
- No single data mining algorithm outperforms all the others on every problem. Try different algorithms and draw conclusions from the results. Use your business knowledge.
- Neural networks do no better than statistical models when the underlying structure is known. However, neural networks can detect hidden interactions and nonlinearity. Use prior information if available.

33 Algorithms Comparison
- Data mining algorithms cannot handle dependent records. Use the prior information. Statistical models help.
- Data tuning and dimension reduction enhance data mining before and after the analysis. Statistical techniques help.

34 Data Mining Process
- Collecting relevant data
- Model building
- Understanding of business
- Problem identification
- Business strategy and evaluation
- Learning
- Action

35 Trends that Affect Data Mining
Data trends
- data explosion
- data types

36 Trends that Affect Data Mining
Hardware trends
- memory
- processing speed
- storage

37 Trends that Affect Data Mining
Network trends
- network connectivity
- distributed databases
Wireless communication

38 Trends that Affect Data Mining
Scientific computing trends
- theory, experiment and simulation

39 Trends that Affect Data Mining
Business trends
- total quality management
- customer relationship management
- business process reengineering
- enterprise resources planning
- supply chain management
- business intelligence and knowledge management
- e-business and m-business

40 Trends that Affect Data Mining
Privacy and Security

41 Pot of Gold
- The benefits of knowing one’s business and customers become so critical that technologies are coming together to support data mining.
- Data mining is not cybernetic magic that will turn your data into gold. It is the process and result of knowledge production, knowledge discovery and knowledge management.

