Published by Meredith Reeves. Modified over 9 years ago.
1 Data Mining
Books:
1. Data Mining, 1996. Pieter Adriaans and Dolf Zantinge. Addison-Wesley.
2. Discovering Data Mining: From Concept to Implementation, 1997. Cabena et al. Prentice Hall.
3. Data Mining: Concepts and Techniques, 2000. Jiawei Han and Micheline Kamber. Morgan Kaufmann.
2 Proceedings
1. Proceedings of the International Conference on Data Mining (ICDM)
2. Proceedings of the International Conference on Data Engineering (ICDE)
3. Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
4. Proceedings of the International Conference on Very Large Data Bases (VLDB)
5. Proceedings of ACM SIGMOD International Conference on Management of Data
6. Proceedings of the International Conference on Database Systems for Advanced Applications (DASFAA)
7. Proceedings of the International Conference on Database and Expert Systems Applications (DEXA)
3 Proceedings (continued)
8. Proceedings of the International Conference on Data Warehousing and Knowledge Discovery (DaWaK)
9. Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD)
10. European Conference on Principles of Data Mining and Knowledge Discovery (PKDD)
Journals
1. IEEE Transactions on Knowledge and Data Engineering (TKDE)
2. Journal of Intelligent Information Systems
3. Data Mining and Knowledge Discovery
4. ACM SIGMOD Record
5. The International Journal on Very Large Data Bases (The VLDB Journal)
6. Knowledge and Information Systems
7. Data & Knowledge Engineering
8. International Journal of Cooperative Information Systems
4 Outline
- Introduction
- Knowledge Discovery in Databases (KDD)
- Data Mining and Query Tools
- Basic Data Mining Techniques
- Data Mining and Data Warehouse
- Association Rules
5 A short story
- The Library of Babel (infinite)
  - Books must be somewhere in the library
  - People wander around this library until they die
  - The library contains an infinite amount of data but no information
- Today’s environment
  - Too much data but too little information
- Challenge
  - Find the required information in huge amounts of data
  - As the amount of data grows, it becomes increasingly difficult to find the meaningful information
6 Knowledge Discovery in Databases (KDD)
- The whole process of extracting implicit, previously unknown, and potentially useful knowledge, as a production factor, from large data sets
- Includes data selection, cleaning, coding, data mining, and reporting
Data Mining
- The key stage of Knowledge Discovery in Databases (KDD)
- The process of finding the desired information in large databases
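The five stages named on this slide (selection, cleaning, coding, data mining, reporting) can be pictured as a chain of small functions. The following is only a minimal sketch under stated assumptions: the records, the field layout, the product names, and every function name are hypothetical, and the "mining" step is a simple frequency count standing in for a real data mining technique.

```python
from collections import Counter

# Hypothetical raw records: (customer_id, age, product). None marks a missing age.
RAW = [
    (1, 34, "magazine"),
    (2, None, "magazine"),
    (3, 51, "book"),
    (3, 51, "book"),       # exact duplicate of the previous record
    (4, 27, "magazine"),
    (5, 63, "book"),
]

def select(records, products):
    """Selection: keep only the records about the products under study."""
    return [r for r in records if r[2] in products]

def clean(records):
    """Cleaning: drop records with a missing age and exact duplicates."""
    seen, kept = set(), []
    for record in records:
        if record[1] is None or record in seen:
            continue
        seen.add(record)
        kept.append(record)
    return kept

def code(records):
    """Coding: replace the exact age with a decade band such as '30s'."""
    return [(cid, f"{age // 10 * 10}s", product) for cid, age, product in records]

def mine(records):
    """Data mining (stand-in): most frequently bought product per age band."""
    counts = {}
    for _cid, band, product in records:
        counts.setdefault(band, Counter())[product] += 1
    return {band: c.most_common(1)[0][0] for band, c in counts.items()}

def report(patterns):
    """Reporting: turn the patterns into readable lines, sorted by age band."""
    return [f"age {band}: mostly buys {product}"
            for band, product in sorted(patterns.items())]

def kdd(records):
    """Run the five stages in the order given on the slide."""
    return report(mine(code(clean(select(records, {"magazine", "book"})))))

for line in kdd(RAW):
    print(line)
```

Keeping each stage as a separate function mirrors the point of the slide: data mining is one stage of the process, and its result depends on the selection, cleaning, and coding done before it.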
7 KDD is not a new technique but rather a multidisciplinary field of research
8 AI, machine learning (1950)
- It is extremely difficult to create a computer with intelligence close to that of human beings
  - Lack of creativity and self-learning
- 1960: research on learning stops
  - Neural networks fail (a single-layer network cannot represent XOR)
- 1980 ~: neural networks change architecture, new machine learning algorithms (decision trees, genetic algorithms, etc.), powerful computers, focus on simple and practical problems
9 Why learning
- Even a simple problem, such as timetable planning, can be extremely hard to solve with a computer yet easily solved by an experienced human
- Using an expert system to solve the problem
  - Even for simple systems, a great many rules exist
  - It is difficult to find the right rules
  - Relevant experts must be interviewed many times, and their answers integrated, to obtain the expert knowledge
- Knowledge acquisition: using learning algorithms to generate rules automatically
10 Why the interest in data mining
- In the 1980s, organizations began to build databases. By now, these contain gigabytes of data with much ‘hidden’ information that cannot easily be traced using SQL
  - SQL is just a query language: it answers questions only within constraints you already know
- As the use of networks grows, it will become increasingly easy to connect databases
  - Discover more information
- Machine learning techniques have improved
  - Easier to find interesting information
- Client/server environment
- Electronic commerce
11 Data mining tool & query tool
- Suppose a large database contains millions of records that describe customers’ purchases
  - Who bought which product on what date?
  - What is the average turnover in July?
  - What is an optimal segmentation of clients?
  - What are the most important trends in customer behavior?
- If you know exactly what you are looking for, use a query tool
- If you know only vaguely what you are looking for, use a data mining tool
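The contrast on this slide can be made concrete with a small sketch. The first function answers a precisely stated question ("What is the average turnover in July?") with an SQL query. The second looks for a segmentation of clients without being told where the boundary lies. Everything here is an assumption for illustration: the table layout, the sample rows, the function names, and the use of a one-dimensional 2-means clustering on total spend as the "data mining tool".

```python
import sqlite3

# Hypothetical purchase records: (customer, product, day, amount).
PURCHASES = [
    ("ann", "book", "2000-07-03", 20.0),
    ("bob", "game", "2000-07-15", 40.0),
    ("ann", "book", "2000-08-01", 30.0),
    ("cat", "bike", "2000-07-20", 900.0),
    ("dan", "bike", "2000-08-11", 1100.0),
]

def july_average(rows):
    """Query tool: the question is known exactly, so SQL can answer it."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE purchases (customer TEXT, product TEXT, day TEXT, amount REAL)"
    )
    con.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", rows)
    (average,) = con.execute(
        "SELECT AVG(amount) FROM purchases WHERE strftime('%m', day) = '07'"
    ).fetchone()
    con.close()
    return average

def two_segments(rows, iterations=10):
    """Mining tool (stand-in): split customers into two groups by total spend.

    Uses 1-D 2-means: start the two centers at the smallest and largest
    totals, then alternate between assigning customers and moving centers.
    """
    totals = {}
    for customer, _product, _day, amount in rows:
        totals[customer] = totals.get(customer, 0.0) + amount
    lo, hi = min(totals.values()), max(totals.values())
    for _ in range(iterations):
        low = {c for c, t in totals.items() if abs(t - lo) <= abs(t - hi)}
        high = set(totals) - low
        lo = sum(totals[c] for c in low) / len(low)
        if high:  # stays empty only when every customer spent the same amount
            hi = sum(totals[c] for c in high) / len(high)
    return low, high

print(july_average(PURCHASES))
print(two_segments(PURCHASES))
```

The query needs the analyst to supply the month and the measure in advance, whereas the clustering step is given only the data and returns a grouping that nobody specified beforehand.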
12 Data mining in electronic commerce
- The success of KDD comes primarily from marketing
- Prediction
  - A customer buying baby clothes today may buy computer games in ten years, and fifteen years later a motorcycle
13 Suppose a company keeps data about which products its customers bought
- Mailing to everyone: only 3% ~ 4% show interest
- Analyzing user behavior and clustering customers according to their interests can save 50% of mailing costs
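The mailing figures on this slide can be worked through with a little arithmetic. The sketch below is illustrative only: the customer count, the cost per letter, and above all the assumption that the targeted half of the customers still contains 90% of the responders are invented numbers, not results from the presentation.

```python
# All figures below are hypothetical, chosen only to illustrate the arithmetic.
CUSTOMERS = 100_000
COST_PER_LETTER = 1          # currency units per mailed letter
BLANKET_RESPONDERS = 3_500   # 3.5% of customers, inside the 3% ~ 4% range

# Assumption: clustering lets the company mail only half of its customers,
# and that half still contains 90% of the people who would have responded.
TARGETED_RECIPIENTS = CUSTOMERS // 2
TARGETED_RESPONDERS = BLANKET_RESPONDERS * 90 // 100

def mailing_cost(recipients, cost_per_letter=COST_PER_LETTER):
    """Total cost of sending one letter to each recipient."""
    return recipients * cost_per_letter

def cost_per_response(recipients, responders):
    """Mailing cost divided by the number of people who respond."""
    return mailing_cost(recipients) / responders

blanket_cost = mailing_cost(CUSTOMERS)
targeted_cost = mailing_cost(TARGETED_RECIPIENTS)
saving = 1 - targeted_cost / blanket_cost   # fraction of mailing cost saved

print(f"blanket mailing:  cost {blanket_cost}, "
      f"per response {cost_per_response(CUSTOMERS, BLANKET_RESPONDERS):.2f}")
print(f"targeted mailing: cost {targeted_cost}, "
      f"per response {cost_per_response(TARGETED_RECIPIENTS, TARGETED_RESPONDERS):.2f}")
print(f"mailing cost saved: {saving:.0%}")
```

Under these assumptions the 50% saving comes directly from mailing half as many letters. How many responses are lost in exchange depends entirely on how well the clusters separate interested customers from the rest.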
14 The problems of data mining
- Lack of long-term vision
  - What do we want to get from the database in the future?
- Not all files are up to date
  - Example: the price of a computer
- Struggle between departments
  - Poor cooperation between users and the EDP dept.
- Legal and privacy restrictions
- Data models need to be transformed for different data mining techniques
- Timing problems: integrating data from different sources
- Interpretation problems