Knowledge Discovery Centre: CityU-SAS Partnership
1. Speakers
Prof Y V Hui, CityU
Dr H P Lo, CityU
Dr Sammy Yuen, CityU
Dr K W Cheng, SAS Institute
Mr Steven Parker, Standard Chartered
2. The Art and Science of Data Mining
Y V Hui, City University of Hong Kong
3. The Driving Forces
Specialization and focus in business
- To satisfy the needs of customers
- To improve and develop specific business strategies and processes
- Personalization through mass customization
4. The Driving Forces
Challenges
- local and global competition
- distributed business operations
- product innovation
Technology development
Benefit, cost and risk on a product or customer basis
5. Data Mining
Also known as knowledge discovery in databases.
Data mining digs out valuable information from large and messy data. (Computer scientist's definition)
Data mining is a knowledge discovery process. It's the integration of business knowledge, people, information, statistics and computing technology.
6. Data Mining is Hot
Ten Hottest Jobs, Time, 22 May 2000
10 emerging areas of technology, MIT's Magazine of Technology Review, Jan/Feb 2001
7. Data Mining Philosophy
A powerful enabler of competitive advantage.
Data mining is driven by business knowledge.
Data mining is about enabling people to discover actionable information about their business.
Return of profit isn't about algorithms.
8. Scope of Data Mining
Management's Decision World
- Business outlook
- Industry conditions
- Product offering
- Customer analysis
- Strategic options
- Competitive actions, etc.
Interface
- Problem development and management
- Reporting and evaluations
Data Miner's Analytical World
- Project design
- Data collection and preparation
- Model building
- Validation
9. Project Management
Cross-functional team
System architecture
10. Successful applications
Business transaction
- risks and opportunities
Customer relationship management
- personalization, target marketing
Electronic commerce & web
- web mining
11. Successful applications
Science & engineering
Health care
Multi-media
Others
12. Data Mining Process
Understanding of business
Problem identification
13. Understanding Your Business
Do we have a problem?
- What is the current situation? Are there any undesirable situations that need attention?
- Are there any conditions, processes, etc., that could be improved?
- Are any problems foreseeable that could affect the business?
- Are there any potential opportunities that the company may capitalize on?
A problem is a learning opportunity.
14. Understanding Your Problem
Operational or analytical
Convention rule or knowledge discovery
Product based or customer based
Market research or data mining
Ownership of the information
Privacy
Added value
15. Data Mining Process
Understanding of business
Problem identification
Collecting relevant information
16. Collecting Relevant Information
Data Search
Data Collection
Data Preparation
Data Mining Database
17. Data Search
Exploring the problem space. Don't let the data drive the problem.
Measurement
Exploring the data sources
18. Data Collection
Data retrieval
Data audit
Data set assembly and data warehouse
Survey
19. Data Preparation
Data representation
Data exploration
Data normalization
Data transformation
Imputation of missing data
Data tuning
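Two of the data preparation steps listed on this slide, imputation of missing data and data normalization, can be illustrated with a minimal sketch. The slides do not say which methods the speakers used; mean imputation and z-score scaling are assumed here purely for illustration, and the `incomes` column is hypothetical.

```python
from statistics import mean, pstdev

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def z_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical column with one missing value.
incomes = [30.0, None, 50.0, 40.0]
filled = impute_mean(incomes)
scaled = z_normalize(filled)
```

Mean imputation is the simplest choice and can understate variability; other imputation methods may suit a real project better.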
20. Data Mining Database
Variable selection
Record selection
Data set partition
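The "data set partition" step is commonly done by splitting records at random into training, validation and test sets. The sketch below assumes a 60/20/20 split and a fixed seed; both are illustrative choices, not values taken from the slides.

```python
import random

def partition(records, train=0.6, valid=0.2, seed=0):
    """Shuffle the records and split them into train, validation and test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_valid],
        shuffled[n_train + n_valid:],  # whatever remains is the test set
    )

# Hypothetical record identifiers 0..9.
train_set, valid_set, test_set = partition(range(10))
```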
21. Data Mining Process
Understanding of business
Problem identification
Collecting relevant information
Model building
Learning
22. Model Building
Model based vs non-model based
$(y_1, y_2, \dots, y_p) = f(x_1, \dots, x_q)$
Inputs: $x_1, \dots, x_q$
Outputs: $y_1, \dots, y_p$
23. Model Building
Parametric vs nonparametric
24. Model Building
Estimation vs trial and error
Directed vs undirected
Multidimensional analysis
Large data set vs small data set
25. Data Mining Algorithms
Online Analytical Processing
- SQL
- Query Tools
Discovery Driven Methods
- Description: Visualization, Clustering, Association, Sequential Analysis
- Prediction: Classification, Regressions, Decision Trees, Neural Networks
26. Online Analytical Processing
Query and reporting
Example of SQL query: How many credit-card customers who made purchases of over $1,000 on sporting goods in December have at least $20,000 of available credit?
Manual and validation driven
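The example question on this slide can be written as an actual SQL query. The sketch below runs it against an in-memory SQLite database; the `customers` and `purchases` tables, their columns and the sample rows are all hypothetical, and "purchases of over $1,000" is read as total December spending on sporting goods, which is only one possible interpretation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, available_credit REAL);
CREATE TABLE purchases (customer_id INTEGER, category TEXT, amount REAL, month INTEGER);
INSERT INTO customers VALUES (1, 25000), (2, 15000), (3, 30000);
INSERT INTO purchases VALUES
  (1, 'sporting goods', 1200, 12),
  (2, 'sporting goods', 1500, 12),
  (3, 'sporting goods', 400, 12),
  (3, 'groceries', 2000, 12);
""")

# Count customers with enough available credit whose December
# sporting-goods purchases add up to more than $1,000.
query = """
SELECT COUNT(*) FROM customers c
WHERE c.available_credit >= 20000
  AND (SELECT COALESCE(SUM(p.amount), 0) FROM purchases p
       WHERE p.customer_id = c.id
         AND p.category = 'sporting goods'
         AND p.month = 12) > 1000
"""
(count,) = conn.execute(query).fetchone()
```

The analyst has to know in advance which question to ask, which is what the slide means by "manual and validation driven".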
27. Estimation and Prediction
Statistical models
Neural network
Example: Housing price valuation model
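A very small statistical model of the kind this slide names is a least-squares line relating price to a single input. The sketch below is not the speakers' housing model; the floor areas and prices are made-up numbers, and a real valuation model would use many more inputs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: floor area (square metres) and price (millions).
areas = [50, 70, 90, 110]
prices = [2.5, 3.5, 4.5, 5.5]

slope, intercept = fit_line(areas, prices)
estimate = slope * 80 + intercept  # predicted price for an 80 square metre flat
```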
28. Classification Algorithms
Statistical techniques
Neural networks
Genetic algorithms
Nearest neighbor method
Rule induction and decision tree
Example: Customer segmentation and buying behavior description
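Of the classifiers listed on this slide, the nearest neighbor method is the shortest to sketch. The example below assigns a customer to the segment held by the majority of its k closest labelled customers; the two features (age, monthly spend), the segment names and k = 3 are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_classify(train, point, k=3):
    """Return the majority label among the k training points nearest to `point`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labelled customers: (age, monthly spend) -> segment.
customers = [
    ((25, 20), "budget"), ((30, 25), "budget"), ((28, 22), "budget"),
    ((50, 90), "premium"), ((55, 85), "premium"), ((60, 95), "premium"),
]

segment = knn_classify(customers, (52, 88))
```

In practice the features would be normalized first, since Euclidean distance is sensitive to the scale of each input.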
29. Association Rules
Apriori algorithm
Example: Market basket analysis, cross selling analysis
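The core of the Apriori algorithm is a level-wise search: frequent itemsets of size k are built only from frequent itemsets of size k - 1. The sketch below is a compact, unoptimized version of that idea; the baskets and the 0.5 support threshold are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical market baskets.
baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
frequent = apriori(baskets, min_support=0.5)

# Confidence of the rule bread -> milk: support(bread, milk) / support(bread).
confidence = frequent[frozenset({"bread", "milk"})] / frequent[frozenset({"bread"})]
```

A rule such as bread -> milk with high confidence is the kind of finding a cross-selling analysis would then check against business knowledge.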
30. Sequential Analysis
Count-all algorithm
Count-some algorithm
Example: Attached mailing, add-on sales
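The sketch below is not the count-all or count-some algorithm named on this slide. It is a deliberately simplified illustration of what sequential analysis measures: the share of customers who bought one item and later bought another, which is the quantity behind an add-on sales recommendation. The purchase histories are hypothetical.

```python
from collections import Counter

def ordered_pair_support(histories):
    """Fraction of customers whose history contains item a followed later by item b."""
    counts = Counter()
    for history in histories:
        seen_pairs = set()  # count each ordered pair at most once per customer
        for i, first in enumerate(history):
            for later in history[i + 1:]:
                if first != later:
                    seen_pairs.add((first, later))
        counts.update(seen_pairs)
    n = len(histories)
    return {pair: c / n for pair, c in counts.items()}

# Hypothetical purchase histories, one time-ordered list per customer.
histories = [
    ["printer", "ink"],
    ["printer", "paper", "ink"],
    ["ink", "printer"],
]
support = ordered_pair_support(histories)
```

Unlike an association rule, the pair (printer, ink) is distinct from (ink, printer), because order matters.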
31. Algorithms Comparison
No single data mining algorithm outperforms all the others on every problem.
Try different algorithms and draw conclusions from the results. Use your business knowledge.
Neural networks do no better than statistical models when the underlying structure is known. However, neural networks can detect hidden interactions and nonlinearity.
Use the prior information if available.
32. Algorithms Comparison
Data mining algorithms cannot handle dependent records. Use the prior information. Statistical models help.
Data tuning and dimension reduction enhance data mining before and after the analysis. Statistical techniques help.
33. Data Mining Process
Understanding of business
Problem identification
Collecting relevant data
Model building
Business strategy and evaluation
Action
Learning
34. Trends that Affect Data Mining
Data trends
- data explosion
- data types
35. Trends that Affect Data Mining
Hardware trends
- memory
- processing speed
- storage
36. Trends that Affect Data Mining
Network trends
- network connectivity
- distributed databases
Wireless communication
37. Trends that Affect Data Mining
Scientific computing trends
- theory, experiment and simulation
38. Trends that Affect Data Mining
Business trends
- total quality management
- customer relationship management
- business process reengineering
- enterprise resources planning
- supply chain management
- business intelligence and knowledge management
- e-business and m-business
39. Trends that Affect Data Mining
Privacy and Security
40. Pot of Gold
The benefits of knowing one's business and customers have become so critical that technologies are coming together to support data mining.
Data mining is not cybernetic magic that will turn your data into gold. It's the process and result of knowledge production, knowledge discovery and knowledge management.